Preface to the Second Edition


We are happy that Springer Verlag asked us to prepare the second edition of our
textbook Analysis for Computer Scientists. We are still convinced that the
algorithmic approach developed in the first edition is an appropriate concept for
presenting the subject of analysis. Accordingly, there was no need to make larger
changes.

However, we took the opportunity to add and update some material. In particular,
we added hyperbolic functions and gave some more details on curves and
surfaces in space. Two new sections have been added: one on second-order
differential equations and one on the pendulum equation. Moreover, the exercise
sections have been extended considerably. Statistical data have been updated where
appropriate.

Due  to  the  essential  importance  of  the  MATLAB  programs  for  our  concept,  we 
have  decided  to  provide  these  programs  additionally  in  Python  for  the  users’ 
convenience. 

We  thank  the  editors  of  Springer,  especially  Simon  Rees  and  Wayne  Wheeler, 
for  their  support  during  the  preparation  of  the  second  edition. 

Innsbruck,  Austria  Michael  Oberguggenberger 

March  2018  Alexander  Ostermann 
Preface to the First Edition


Mathematics  and  mathematical  modelling  are  of  central  importance  in  computer 
science.  For  this  reason  the  teaching  concepts  of  mathematics  in  computer  science 
have  to  be  constantly  reconsidered,  and  the  choice  of  material  and  the  motivation 
have  to  be  adapted.  This  applies  in  particular  to  mathematical  analysis,  whose 
significance  has  to  be  conveyed  in  an  environment  where  thinking  in  discrete 
structures  is  predominant.  On  the  one  hand,  an  analysis  course  in  computer  science 
has  to  cover  the  essential  basic  knowledge.  On  the  other  hand,  it  has  to  convey  the 
importance  of  mathematical  analysis  in  applications,  especially  those  which  will  be 
encountered  by  computer  scientists  in  their  professional  life. 

We  see  a  need  to  renew  the  didactic  principles  of  mathematics  teaching  in 
computer  science,  and  to  restructure  the  teaching  according  to  contemporary 
requirements. We try to give an answer with this textbook, which we have
developed based on the following concepts:

1.  algorithmic  approach; 

2.  concise  presentation; 

3.  integrating  mathematical  software  as  an  important  component; 

4.  emphasis  on  modelling  and  applications  of  analysis. 

The  book  is  positioned  in  the  triangle  between  mathematics,  computer  science  and 
applications. In this field, algorithmic thinking is of high importance. The
algorithmic approach chosen by us encompasses:

a.  development  of  concepts  of  analysis  from  an  algorithmic  point  of  view; 

b.  illustrations  and  explanations  using  MATLAB  and  maple  programs  as  well  as  Java 
applets; 

c.  computer  experiments  and  programming  exercises  as  motivation  for  actively 
acquiring  the  subject  matter; 

d.  mathematical  theory  combined  with  basic  concepts  and  methods  of  numerical 
analysis. 

Concise presentation means for us that we have deliberately reduced the subject
matter to the essential ideas. For example, we do not discuss the general
convergence theory of power series; however, we do outline Taylor expansion with an
estimate of the remainder term. (Taylor expansion is included in the book as it is an
indispensable tool for modelling and numerical analysis.) For the sake of
readability, proofs are only detailed in the main text if they introduce essential ideas and
contribute to the understanding of the concepts. To continue with the example
above, the integral representation of the remainder term of the Taylor expansion is
derived by integration by parts. In contrast, Lagrange's form of the remainder term,
which requires the mean value theorem of integration, is only mentioned.
Nevertheless we have put effort into ensuring a self-contained presentation. We assign a
high value to geometric intuition, which is reflected in a large number of
illustrations.

Due to the terse presentation it was possible to cover the whole spectrum from
foundations to interesting applications of analysis (again selected from the
viewpoint of computer science), such as fractals, L-systems, curves and surfaces, linear
regression, differential equations and dynamical systems. These topics give
sufficient opportunity to enter various aspects of mathematical modelling.

The  present  book  is  a  translation  of  the  original  German  version  that  appeared  in 
2005  (with  the  second  edition  in  2009).  We  have  kept  the  structure  of  the  German 
text,  but  took  the  opportunity  to  improve  the  presentation  at  various  places. 

The contents of the book are as follows. Chapters 1-8, 10-12 and 14-17 are
devoted to the basic concepts of analysis, and Chapters 9, 13 and 18-21 are
dedicated to important applications and more advanced topics. The Appendices A and
B  collect  some  tools  from  vector  and  matrix  algebra,  and  Appendix  C  supplies 
further  details  which  were  deliberately  omitted  in  the  main  text.  The  employed 
software,  which  is  an  integral  part  of  our  concept,  is  summarised  in  Appendix  D. 
Each  chapter  is  preceded  by  a  brief  introduction  for  orientation.  The  text  is  enriched 
by  computer  experiments  which  should  encourage  the  reader  to  actively  acquire  the 
subject  matter.  Finally,  every  chapter  has  exercises,  half  of  which  are  to  be  solved 
with  the  help  of  computer  programs.  The  book  can  be  used  from  the  first  semester 
on  as  the  main  textbook  for  a  course,  as  a  complementary  text  or  for  self-study. 

We  thank  Elisabeth  Bradley  for  her  help  in  the  translation  of  the  text.  Further,  we 
thank  the  editors  of  Springer,  especially  Simon  Rees  and  Wayne  Wheeler,  for  their 
support  and  advice  during  the  preparation  of  the  English  text. 


Innsbruck,  Austria 
January  2011 


Michael  Oberguggenberger 
Alexander  Ostermann 
Numbers


The commonly known rational numbers (fractions) are not sufficient for a rigorous
foundation of mathematical analysis. The historical development shows that for
issues concerning analysis, the rational numbers have to be extended to the real
numbers. For clarity we introduce the real numbers as decimal numbers with an infinite
number of decimal places. We illustrate by examples how the rules of calculation and
the order relation extend from the rational to the real numbers in a natural way.

A  further  section  is  dedicated  to  floating  point  numbers,  which  are  implemented 
in  most  programming  languages  as  approximations  to  the  real  numbers.  In  particular, 
we  will  discuss  optimal  rounding  and  in  connection  with  this  the  relative  machine 
accuracy. 


1.1 The Real Numbers

In  this  book  we  assume  the  following  number  systems  as  known: 

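$\mathbb{N} = \{1, 2, 3, 4, \dots\}$  the set of natural numbers;

$\mathbb{N}_0 = \mathbb{N} \cup \{0\}$  the set of natural numbers including zero;

$\mathbb{Z} = \{\dots, -3, -2, -1, 0, 1, 2, 3, \dots\}$  the set of integers;

$\mathbb{Q} = \left\{ \frac{k}{n} \,;\, k \in \mathbb{Z} \text{ and } n \in \mathbb{N} \right\}$  the set of rational numbers.

Two rational numbers $\frac{k}{n}$ and $\frac{\ell}{m}$ are equal if and only if $km = \ell n$. Further an integer
$k \in \mathbb{Z}$ can be identified with the fraction $\frac{k}{1} \in \mathbb{Q}$. Consequently, the inclusions
$\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q}$ are true.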


Let $M$ and $N$ be arbitrary sets. A mapping from $M$ to $N$ is a rule which assigns to
each element in $M$ exactly one element in $N$. A mapping is called bijective if for
each element $n \in N$ there exists exactly one element in $M$ which is assigned to $n$.


Definition 1.1 Two sets $M$ and $N$ have the same cardinality if there exists a bijective
mapping between these sets. A set $M$ is called countably infinite if it has the same
cardinality as $\mathbb{N}$.


The sets $\mathbb{N}$, $\mathbb{Z}$ and $\mathbb{Q}$ have the same cardinality and in this sense are equally large.
All three sets have an infinite number of elements which can be enumerated. Each
enumeration represents a bijective mapping to $\mathbb{N}$. The countability of $\mathbb{Z}$ can be seen
from the representation $\mathbb{Z} = \{0, 1, -1, 2, -2, 3, -3, \dots\}$. To prove the countability
of $\mathbb{Q}$, Cantor's diagonal method is used:


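$$
\begin{array}{ccccc}
\frac{1}{1} & \frac{2}{1} & \frac{3}{1} & \frac{4}{1} & \cdots \\[4pt]
\frac{1}{2} & \frac{2}{2} & \frac{3}{2} & \frac{4}{2} & \cdots \\[4pt]
\frac{1}{3} & \frac{2}{3} & \frac{3}{3} & \frac{4}{3} & \cdots \\[4pt]
\frac{1}{4} & \frac{2}{4} & \frac{3}{4} & \frac{4}{4} & \cdots
\end{array}
$$

(The arrows of the scheme run diagonally through this array.)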

The enumeration is carried out in the direction of the arrows, where each rational number
is only counted at its first appearance. In this way the countability of all positive
rational numbers (and therefore all rational numbers) is proven.
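
The enumeration can be tried out on a computer. The following is a minimal sketch, not the book's program: it walks the anti-diagonals of the array $k/n$ (on each of them $k + n$ is constant) and skips every fraction whose value has already appeared. The function name `positive_rationals` and the direction in which each diagonal is traversed are assumptions made for this illustration; any traversal that reaches every entry of the array works equally well.

```python
from fractions import Fraction
from itertools import count, islice


def positive_rationals():
    """Yield every positive rational exactly once, diagonal by diagonal."""
    seen = set()
    for s in count(2):               # s = k + n is constant on an anti-diagonal
        for k in range(1, s):
            q = Fraction(k, s - k)   # the entry k/n with n = s - k
            if q not in seen:        # count a number only at its first appearance
                seen.add(q)
                yield q


first = list(islice(positive_rationals(), 10))
print(first)
```

The position of a number in the generated sequence is its image under the bijective mapping to $\mathbb{N}$.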

To  visualise  the  rational  numbers  we  use  a  line,  which  can  be  pictured  as  an 
infinitely  long  ruler,  on  which  an  arbitrary  point  is  labelled  as  zero.  The  integers  are 
marked  equidistantly  starting  from  zero.  Likewise  each  rational  number  is  allocated 
a  specific  place  on  the  real  line  according  to  its  size,  see  Fig.  1.1. 

However, the real line also contains points which do not correspond to rational
numbers. (We say that $\mathbb{Q}$ is not complete.) For instance, the length of the diagonal $d$
in the unit square (see Fig. 1.2) can be measured with a ruler. Yet, the Pythagoreans
already knew that $d^2 = 2$, but that $d = \sqrt{2}$ is not a rational number.


Fig. 1.1 The real line


1  We  will  rarely  use  the  term  mapping  in  such  generality.  The  special  case  of  real-valued  functions, 
which  is  important  for  us,  will  be  discussed  thoroughly  in  Chap.  2. 

2 G. Cantor, 1845-1918.
Fig. 1.2 Diagonal in the unit square


Proposition 1.2 $\sqrt{2} \notin \mathbb{Q}$.

Proof This statement is proven indirectly. Assume that $\sqrt{2}$ were rational. Then
$\sqrt{2}$ can be represented as a reduced fraction $\sqrt{2} = \frac{k}{n} \in \mathbb{Q}$. Squaring this
equation gives $k^2 = 2n^2$ and thus $k^2$ would be an even number. This is only possible if
$k$ itself is an even number, so $k = 2l$. If we substitute this into the above we obtain
$4l^2 = 2n^2$ which simplifies to $2l^2 = n^2$. Consequently $n$ would also be even which
is in contradiction to the initial assumption that the fraction $\frac{k}{n}$ was reduced. □

As is generally known, $\sqrt{2}$ is the unique positive root of the polynomial $x^2 - 2$.
The naive supposition that all non-rational numbers are roots of polynomials with
integer coefficients turns out to be incorrect. There are other non-rational numbers
(so-called transcendental numbers) which cannot be represented in this way. For
example, the ratio of a circle's circumference to its diameter

$$\pi = 3.141592653589793\dots \notin \mathbb{Q}$$

is transcendental, but can be represented on the real line as half the circumference
of the circle with radius 1 (e.g. through unwinding).

In  the  following  we  will  take  up  a  pragmatic  point  of  view  and  construct  the 
missing  numbers  as  decimals. 

Definition 1.3 A finite decimal number $x$ with $l$ decimal places has the form

$$x = \pm\, d_0.d_1 d_2 d_3 \dots d_l$$

with $d_0 \in \mathbb{N}_0$ and the single digits $d_i \in \{0, 1, \dots, 9\}$, $1 \le i \le l$, with $d_l \ne 0$.

Proposition 1.4 (Representing rational numbers as decimals) Each rational
number can be written as a finite or periodic decimal.

Proof Let $q \in \mathbb{Q}$ and consequently $q = \frac{k}{n}$ with $k \in \mathbb{Z}$ and $n \in \mathbb{N}$. One obtains the
representation of $q$ as a decimal by successive division with remainder. Since the
remainder $r \in \mathbb{N}_0$ always fulfils the condition $0 \le r < n$, the remainder will be zero
or periodic after a maximum of $n$ iterations. □

Example 1.5 Let us take $q = -\frac{5}{7} \in \mathbb{Q}$ as an example. Successive division with
remainder shows that $q = -0.71428571428571\dots$ with remainders 5, 1, 3, 2, 6, 4, 5,
1, 3, 2, 6, 4, 5, 1, 3, ... The period of this decimal is six.
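
The successive division with remainder used in the proof of Proposition 1.4 can be sketched in a few lines. The helper names `decimal_expansion` and `period_length` are hypothetical and chosen only for this illustration; the sign of $q$ is ignored, so the code works with $\frac{5}{7}$ instead of $-\frac{5}{7}$.

```python
def decimal_expansion(k, n, places):
    """Digits of k/n (with 0 <= k < n) after the decimal point, plus the remainders."""
    digits, remainders = [], [k]
    r = k
    for _ in range(places):
        d, r = divmod(10 * r, n)   # next digit and next remainder
        digits.append(d)
        remainders.append(r)
    return digits, remainders


def period_length(k, n):
    """Length of the period of k/n, or 0 if the decimal is finite."""
    seen = {}                      # remainder -> step at which it first occurred
    r, step = k % n, 0
    while r != 0 and r not in seen:
        seen[r] = step
        r = (10 * r) % n
        step += 1
    return 0 if r == 0 else step - seen[r]


digits, remainders = decimal_expansion(5, 7, 12)
print(digits)                # digits of 5/7
print(remainders)            # remainders of the division
print(period_length(5, 7))   # length of the period
```

Since only the $n$ remainders $0, 1, \dots, n-1$ are possible, the loop in `period_length` stops after at most $n$ steps.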
Each nonzero decimal with a finite number of decimal places can be written as a
periodic decimal (with an infinite number of decimal places). To this end one
diminishes the last nonzero digit by one and then fills the remaining infinitely many decimal
places with the digit 9. For example, the fraction $-\frac{17}{50} = -0.34 = -0.3399999\dots$
becomes periodic after the third decimal place. In this way $\mathbb{Q}$ can be considered as
the set of all decimals which turn periodic from a certain number of decimal places
onwards.

Definition 1.6 The set of real numbers $\mathbb{R}$ consists of all decimals of the form

$$\pm\, d_0.d_1 d_2 d_3 \dots$$

with $d_0 \in \mathbb{N}_0$ and digits $d_i \in \{0, \dots, 9\}$, i.e. decimals with an infinite number of
decimal places. The set $\mathbb{R} \setminus \mathbb{Q}$ is called the set of irrational numbers.

Obviously $\mathbb{Q} \subset \mathbb{R}$. According to what was mentioned so far the numbers

$$0.1010010001000010\dots \quad \text{and} \quad \sqrt{2}$$

are irrational. There are many more irrational than rational numbers, as is shown by
the following proposition.

Proposition 1.7 The set $\mathbb{R}$ is not countable and therefore has higher cardinality
than $\mathbb{Q}$.

Proof This statement is proven indirectly. Assume the real numbers between 0 and
1 to be countable and tabulate them:

$$
\begin{array}{cl}
1 & 0.d_{11} d_{12} d_{13} d_{14} \dots \\
2 & 0.d_{21} d_{22} d_{23} d_{24} \dots \\
3 & 0.d_{31} d_{32} d_{33} d_{34} \dots \\
4 & 0.d_{41} d_{42} d_{43} d_{44} \dots
\end{array}
$$

With the help of this list, we define

$$
d_i = \begin{cases} 1 & \text{if } d_{ii} = 2, \\ 2 & \text{else.} \end{cases}
$$

Then $x = 0.d_1 d_2 d_3 \dots$ is not included in the above list, which is a contradiction to
the initial assumption of countability. □


Fig. 1.3 Babylonian cuneiform inscription YBC 7289 (Yale Babylonian Collection, with
authorisation) from 1900 before our time with a translation of the inscription according to [1]. It represents
a square with side length 30 and diagonals 42; 25, 35. The ratio is $\sqrt{2} \approx 1; 24, 51, 10$


However, although $\mathbb{R}$ contains considerably more numbers than $\mathbb{Q}$, every real
number can be approximated by rational numbers to any degree of accuracy, e.g.
$\pi$ to nine digits

$$\pi \approx \frac{314159265}{100000000}.$$

Good  approximations  to  the  real  numbers  are  sufficient  for  practical  applications. 
For $\sqrt{2}$, already the Babylonians were aware of such approximations:

$$\sqrt{2} \approx 1; 24, 51, 10 = 1 + \frac{24}{60} + \frac{51}{60^2} + \frac{10}{60^3} = 1.41421296\dots,$$

see Fig. 1.3. The somewhat unfamiliar notation is due to the fact that the Babylonians
worked in the sexagesimal system with base 60.
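
The sexagesimal values can be checked with exact rational arithmetic. The sketch below uses Python's `fractions` module; the helper `sexagesimal` is a hypothetical name introduced for this illustration. It also confirms that the diagonal 42; 25, 35 quoted in the caption of Fig. 1.3 is exactly 30 times the Babylonian approximation.

```python
from fractions import Fraction


def sexagesimal(integer_part, places):
    """Exact value of 'integer_part; p1, p2, ...' in base 60."""
    return integer_part + sum(Fraction(p, 60 ** (j + 1)) for j, p in enumerate(places))


approx = sexagesimal(1, [24, 51, 10])    # Babylonian approximation to sqrt(2)
diagonal = sexagesimal(42, [25, 35])     # diagonal of the square with side length 30

print(approx, float(approx))
print(30 * approx == diagonal)
print(abs(float(approx) - 2 ** 0.5))     # deviation from the double precision value of sqrt(2)
```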


1.2 Order Relation and Arithmetic on $\mathbb{R}$

In  the  following  we  write  real  numbers  (uniquely)  as  decimals  with  an  infinite  number 
of  decimal  places,  for  example,  we  write  0.2999...  instead  of  0.3. 

Definition 1.8 (Order relation) Let $a = a_0.a_1 a_2 \dots$ and $b = b_0.b_1 b_2 \dots$ be
non-negative real numbers in decimal form, i.e. $a_0, b_0 \in \mathbb{N}_0$.

(a) One says that $a$ is less than or equal to $b$ (and writes $a \le b$), if $a = b$ or if there
is an index $j \in \mathbb{N}_0$ such that $a_j < b_j$ and $a_i = b_i$ for $i = 0, \dots, j - 1$.

(b) Furthermore one stipulates that always $-a \le b$ and sets $-a \le -b$ whenever
$b \le a$.

This definition extends the known orders of $\mathbb{N}$ and $\mathbb{Q}$ to $\mathbb{R}$. The interpretation of
the order relation $\le$ on the real line is as follows: $a \le b$ holds true, if $a$ is to the left
of $b$ on the real line, or $a = b$.
The relation $\le$ obviously has the following properties. For all $a, b, c \in \mathbb{R}$ it holds
that

$a \le a$  (reflexivity),

$a \le b$ and $b \le c \;\Rightarrow\; a \le c$  (transitivity),

$a \le b$ and $b \le a \;\Rightarrow\; a = b$  (antisymmetry).

In case of $a \le b$ and $a \ne b$ one writes $a < b$ and calls $a$ less than $b$. Furthermore
one defines $a \ge b$, if $b \le a$ (in words: $a$ greater than or equal to $b$), and $a > b$, if
$b < a$ (in words: $a$ greater than $b$).

Addition  and  multiplication  can  be  carried  over  from  Q  to  R  in  a  similar  way. 
Graphically  one  uses  the  fact  that  each  real  number  corresponds  to  a  segment  on 
the  real  line.  One  thus  defines  the  addition  of  real  numbers  as  the  addition  of  the 
respective  segments. 

A rigorous and at the same time algorithmic definition of the addition starts from
the observation that real numbers can be approximated by rational numbers to any
degree of accuracy. Let $a = a_0.a_1 a_2 \dots$ and $b = b_0.b_1 b_2 \dots$ be two non-negative real
numbers. By cutting them off after $k$ decimal places we obtain two rational
approximations $a^{(k)} = a_0.a_1 a_2 \dots a_k \approx a$ and $b^{(k)} = b_0.b_1 b_2 \dots b_k \approx b$. Then $a^{(k)} + b^{(k)}$ is a
monotonically increasing sequence of approximations to the yet to be defined
number $a + b$. This allows one to define $a + b$ as supremum of these approximations.
To justify this approach rigorously we refer to Chap. 5. The multiplication of real
numbers is defined in the same way. It turns out that the real numbers with addition
and multiplication $(\mathbb{R}, +, \cdot)$ are a field. Therefore the usual rules of calculation apply,
e.g., the distributive law

$$(a + b)c = ac + bc.$$
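
The behaviour of the truncated sums can be observed numerically. In the following sketch the real numbers $a$ and $b$ are represented by the rational numbers $\frac{1}{3}$ and $\frac{2}{7}$, purely so that the computation is exact; the helper `truncate` is a hypothetical name. The sums $a^{(k)} + b^{(k)}$ increase monotonically and stay below $a + b$, with an error of less than $2 \cdot 10^{-k}$.

```python
from fractions import Fraction


def truncate(x, k):
    """Cut the non-negative number x off after k decimal places."""
    return Fraction(int(x * 10 ** k), 10 ** k)


a, b = Fraction(1, 3), Fraction(2, 7)   # stand-ins for two non-negative reals
sums = [truncate(a, k) + truncate(b, k) for k in range(8)]

for k, s in enumerate(sums):
    print(k, float(s), float(a + b - s))   # approximation and remaining error
```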

The following proposition recapitulates some of the important rules for $\le$. The
statements can easily be verified with the help of the real line.

Proposition 1.9 For all $a, b, c \in \mathbb{R}$ the following holds:

$a \le b \;\Rightarrow\; a + c \le b + c$,

$a \le b$ and $c \ge 0 \;\Rightarrow\; ac \le bc$,

$a \le b$ and $c \le 0 \;\Rightarrow\; ac \ge bc$.

Note that $a < b$ does not imply $a^2 < b^2$. For example $-2 < 1$, but nonetheless
$4 > 1$. However, for $a, b \ge 0$ it always holds that $a < b \;\Leftrightarrow\; a^2 < b^2$.

Definition 1.10 (Intervals) The following subsets of $\mathbb{R}$ are called intervals:

$[a, b] = \{x \in \mathbb{R} \,;\, a \le x \le b\}$  closed interval;

$(a, b] = \{x \in \mathbb{R} \,;\, a < x \le b\}$  left half-open interval;

$[a, b) = \{x \in \mathbb{R} \,;\, a \le x < b\}$  right half-open interval;

$(a, b) = \{x \in \mathbb{R} \,;\, a < x < b\}$  open interval.


Fig. 1.4 The intervals $(a, b)$, $[c, d]$ and $(e, f]$ on the real line


Intervals  can  be  visualised  on  the  real  line,  as  illustrated  in  Fig.  1.4. 

It proves to be useful to introduce the symbols $-\infty$ (minus infinity) and $\infty$
(infinity), by means of the property

$$\forall a \in \mathbb{R} : \; -\infty < a < \infty.$$

One may then define, e.g., the improper intervals

$$[a, \infty) = \{x \in \mathbb{R} \,;\, x \ge a\},$$

$$(-\infty, b) = \{x \in \mathbb{R} \,;\, x < b\}$$

and furthermore $(-\infty, \infty) = \mathbb{R}$. Note that $-\infty$ and $\infty$ are only symbols and not
real numbers.


Definition 1.11 The absolute value of a real number $a$ is defined as

$$
|a| = \begin{cases} a, & \text{if } a \ge 0, \\ -a, & \text{if } a < 0. \end{cases}
$$

As an application of the properties of the order relation given in Proposition 1.9
we solve some inequalities as examples.


Example 1.12 Find all $x \in \mathbb{R}$ satisfying $-3x - 2 \le 5 < -3x + 4$. In this example
we have the following two inequalities

$$-3x - 2 \le 5 \quad \text{and} \quad 5 < -3x + 4.$$

The first inequality can be rearranged to

$$-3x \le 7 \quad \Leftrightarrow \quad x \ge -\frac{7}{3}.$$

This is the first constraint for $x$. The second inequality states

$$3x < -1 \quad \Leftrightarrow \quad x < -\frac{1}{3}$$

and poses a second constraint for $x$. The solution to the original problem must fulfil
both constraints. Therefore the solution set is

$$S = \left\{ x \in \mathbb{R} \,;\, -\frac{7}{3} \le x < -\frac{1}{3} \right\}.$$
Example 1.13 Find all $x \in \mathbb{R}$ satisfying $x^2 - 2x \ge 3$. By completing the square the
inequality is rewritten as

$$(x - 1)^2 = x^2 - 2x + 1 \ge 4.$$

Taking the square root we obtain two possibilities

$$x - 1 \ge 2 \quad \text{or} \quad x - 1 \le -2.$$

The combination of those gives the solution set

$$S = \{x \in \mathbb{R} \,;\, x \ge 3 \text{ or } x \le -1\} = (-\infty, -1] \cup [3, \infty).$$
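
Solution sets like those of Examples 1.12 and 1.13 can be cross-checked by testing sample points. The sketch below is a plausibility check, not a proof. It compares the original inequalities with the derived solution sets on a grid of exact fractions that contains the boundary points $-\frac{7}{3}$, $-\frac{1}{3}$, $-1$ and $3$. All function names are hypothetical.

```python
from fractions import Fraction


def satisfies_1_12(x):
    return -3 * x - 2 <= 5 < -3 * x + 4


def in_solution_1_12(x):
    return Fraction(-7, 3) <= x < Fraction(-1, 3)


def satisfies_1_13(x):
    return x * x - 2 * x >= 3


def in_solution_1_13(x):
    return x <= -1 or x >= 3


# Grid with step 1/12 on [-5, 5]; exact arithmetic avoids rounding at the boundaries.
samples = [Fraction(n, 12) for n in range(-60, 61)]

ok_12 = all(satisfies_1_12(x) == in_solution_1_12(x) for x in samples)
ok_13 = all(satisfies_1_13(x) == in_solution_1_13(x) for x in samples)
print(ok_12, ok_13)
```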


1.3 Machine Numbers

The real numbers can be realised only partially on a computer. In exact arithmetic,
for example in maple, real numbers are treated as symbolic expressions, e.g.
$\sqrt{2}$ = RootOf(_Z^2-2). With the help of the command evalf they can be
evaluated, exact to many decimal places.

The floating point numbers that are usually employed in programming languages
as substitutes for the real numbers have a fixed relative accuracy, e.g. double precision
with 52 bit mantissa. The arithmetic rules of $\mathbb{R}$ are not valid for these machine
numbers, e.g.

$$1 + 10^{-20} = 1$$

in double precision. Floating point numbers are standardised by the Institute of
Electrical and Electronics Engineers IEEE 754-1985 and by the International
Electrotechnical Commission IEC 559:1989. In the following we give a short outline of
these machine numbers. Further information can be found in [20].
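
The failure of the usual arithmetic rules is easy to reproduce. The short sketch below assumes that Python's `float` is an IEEE double precision number, which holds on all common platforms; it shows that $1 + 10^{-20}$ evaluates to 1 and that, as a consequence, addition is no longer associative.

```python
import sys

tiny = 1e-20

print(1.0 + tiny == 1.0)            # the small summand is lost
left = (tiny + 1.0) - 1.0           # evaluates to 0.0
right = tiny + (1.0 - 1.0)          # evaluates to 1e-20
print(left, right)
print(sys.float_info.mant_dig)      # mantissa length p, including the hidden bit
```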

One distinguishes between single and double format. The single format (single
precision) requires 32-bit storage space

    field    v    e    M
    bits     1    8    23

The double format (double precision) requires 64-bit storage space

    field    v    e    M
    bits     1    11   52

Here, $v \in \{0, 1\}$ denotes the sign, $e_{\min} \le e \le e_{\max}$ is the exponent (a signed integer)
and $M$ is the mantissa of length $p$

$$M = d_1 2^{-1} + d_2 2^{-2} + \dots + d_p 2^{-p} \cong d_1 d_2 \dots d_p, \qquad d_j \in \{0, 1\}.$$


Fig. 1.5 Floating point numbers on the real line


This representation corresponds to the following number $x$:

$$x = (-1)^v \, 2^e \sum_{j=1}^{p} d_j 2^{-j}.$$

Normalised floating point numbers in base 2 always have $d_1 = 1$. Therefore, one
does not need to store $d_1$ and obtains for the mantissa

    single precision   p = 24;
    double precision   p = 53.

To simplify matters we will only describe the key features of floating point numbers.
For the subtleties of the IEEE-IEC standard, we refer to [20].

In our representation the following range applies for the exponents:

                        e_min     e_max
    single precision    -125      128
    double precision    -1021     1024

With $M = M_{\max}$ and $e = e_{\max}$ one obtains the largest floating point number

$$x_{\max} = (1 - 2^{-p}) \, 2^{e_{\max}},$$

whereas $M = M_{\min}$ and $e = e_{\min}$ gives the smallest positive (normalised) floating
point number

$$x_{\min} = 2^{e_{\min} - 1}.$$

The  floating  point  numbers  are  not  evenly  distributed  on  the  real  line,  but  their  relative 
density  is  nearly  constant,  see  Fig.  1.5. 

In the IEEE standard the following approximate values apply:

                        x_min              x_max
    single precision    1.18 · 10^-38      3.40 · 10^38
    double precision    2.23 · 10^-308     1.80 · 10^308
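
The formulas for $x_{\max}$ and $x_{\min}$ can be compared with the values reported by a programming language. The following sketch evaluates them exactly with `fractions.Fraction` and compares the double precision results with `sys.float_info`, assuming that Python's `float` is an IEEE double. The function names `x_max` and `x_min` are chosen for this illustration only.

```python
import sys
from fractions import Fraction


def x_max(p, e_max):
    """Largest floating point number (1 - 2^-p) * 2^e_max, as an exact fraction."""
    return (1 - Fraction(1, 2 ** p)) * Fraction(2) ** e_max


def x_min(e_min):
    """Smallest positive normalised floating point number 2^(e_min - 1)."""
    return Fraction(2) ** (e_min - 1)


# Double precision: p = 53, e_min = -1021, e_max = 1024
print(x_max(53, 1024) == Fraction(sys.float_info.max))
print(x_min(-1021) == Fraction(sys.float_info.min))

# Single precision: p = 24, e_min = -125, e_max = 128
print(float(x_min(-125)), float(x_max(24, 128)))
```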

Furthermore, there are special symbols like

    ±INF  ...  $\pm\infty$

    NaN   ...  not a number, e.g. for zero divided by zero.

In general, one can continue calculating with these symbols without program
termination.
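
A few lines suffice to see these symbols at work. Note that this sketch produces NaN as $\infty - \infty$ rather than as zero divided by zero, because Python itself raises a `ZeroDivisionError` for `0.0 / 0.0` instead of returning NaN; other environments, for example MATLAB, return NaN in that case.

```python
import math

inf = float("inf")
nan = inf - inf                 # an undefined operation yields NaN

print(inf + 1.0, -inf, nan)     # calculation continues without termination
print(1.0 / inf)                # 0.0
print(nan == nan)               # NaN is not equal to itself
print(math.isnan(nan), math.isinf(-inf))
```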
1.4 Rounding

Let $x = a \cdot 2^e \in \mathbb{R}$ with $1/2 \le a < 1$ and $x_{\min} \le x \le x_{\max}$. Furthermore, let
$u$, $v$ be two adjacent machine numbers with $u \le x \le v$. Then

    u = | 0 | e | d_1 d_2 ... d_p |

and

    v = u + | 0 | e | 00...01 | = u + | 0 | e - (p - 1) | 10...00 |.

Thus $v - u = 2^{e-p}$ and the inequality

$$|\mathrm{rd}(x) - x| \le \frac{1}{2}(v - u) = 2^{e-p-1}$$

holds for the optimal rounding $\mathrm{rd}(x)$ of $x$. With this estimate one can determine the
relative error of the rounding. Due to $\frac{1}{a} \le 2$ it holds that

$$\frac{|\mathrm{rd}(x) - x|}{x} \le \frac{2^{e-p-1}}{a \cdot 2^e} \le 2 \cdot 2^{-p-1} = 2^{-p}.$$

The same calculation is valid for negative $x$ (by using the absolute value).

Definition 1.14 The number $\mathrm{eps} = 2^{-p}$ is called relative machine accuracy.

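The following proposition is an important application of this concept.

Proposition 1.15 Let $x \in \mathbb{R}$ with $x_{\min} \le |x| \le x_{\max}$. Then there exists $\varepsilon \in \mathbb{R}$ with

$$\mathrm{rd}(x) = x(1 + \varepsilon) \quad \text{and} \quad |\varepsilon| \le \mathrm{eps}.$$

Proof We define

$$\varepsilon = \frac{\mathrm{rd}(x) - x}{x}.$$

According to the calculation above, we have $|\varepsilon| \le \mathrm{eps}$. □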


Experiment 1.16 (Experimental determination of eps) Let $z$ be the smallest
positive machine number for which $1 + z > 1$.

    1 = | 0 | 1 | 100...00 |,    z = | 0 | 1 | 000...01 | = 2 · 2^-p.

Thus $z = 2\,\mathrm{eps}$. The number $z$ can be determined experimentally and therefore eps
as well. (Note that the number $z$ is called eps in MATLAB.)


In the IEC/IEEE standard the following applies:

    single precision:  eps = 2^-24 ≈ 5.96 · 10^-8,
    double precision:  eps = 2^-53 ≈ 1.11 · 10^-16.

In  double  precision  arithmetic  an  accuracy  of  approximately  16  places  is  available. 
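
One possible way to carry out the experimental determination is to halve a candidate $z$ until adding the next smaller candidate to 1 no longer changes the result. The sketch below is one such approach under two assumptions: Python's `float` is an IEEE double with rounding to nearest, and only powers of two are tested as candidates. The function name `smallest_z` is hypothetical.

```python
import sys


def smallest_z():
    """Smallest power of two z with 1 + z > 1 in floating point arithmetic."""
    z = 1.0
    while 1.0 + z / 2 > 1.0:   # stop as soon as the next candidate is rounded away
        z /= 2
    return z


z = smallest_z()
eps = z / 2                    # relative machine accuracy, since z = 2 eps

print(z, eps)
print(z == sys.float_info.epsilon)   # z is the quantity that MATLAB calls eps
```

With $p = 53$ the loop stops at $z = 2^{-52}$, so that $\mathrm{eps} = 2^{-53} \approx 1.11 \cdot 10^{-16}$, in agreement with the value stated above.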


1.5 Exercises

1. Show that $\sqrt{3}$ is irrational.

2. Prove the triangle inequality

$$|a + b| \le |a| + |b|$$

for all $a, b \in \mathbb{R}$.

Hint. Distinguish the cases where $a$ and $b$ have either the same or different signs.

3. Sketch the following subsets of the real line:

$$A = \{x : |x| < 1\}, \quad B = \{x : |x - 1| < 2\}, \quad C = \{x : |x| > 3\}.$$

More generally, sketch the set $U_r(a) = \{x : |x - a| < r\}$ (for $a \in \mathbb{R}$, $r > 0$).
Convince yourself that $U_r(a)$ is the set of points of distance less than $r$ to the
point $a$.

4. Solve the following inequalities by hand as well as with maple (using solve).
State the solution set in interval notation.

(a) $4x^2 < 8x + 1$,    (b) $\dfrac{1}{3 - x} > 3 + x$,

(c) $|2 - x^2| > x^2$,    (d) $\dfrac{1 + x}{1 - x} > 1$,

(e) $x^2 < 6 + x$,    (f) $|x| - x > 1$,

(g) $|1 - x^2| < 2x + 2$,    (h) $4x^2 - 13x + 4 < 1$.


5. Determine the solution set of the inequality

$$8(x - 2) > \frac{20}{x + 1} + 3(x - 7).$$

6. Sketch the regions in the $(x, y)$-plane which are given by

(a) $x = y$;  (b) $y < x$;  (c) $y > x$;  (d) $y > |x|$;  (e) $|y| > |x|$.

Hint. Consult Sects. A.1 and A.6 for basic plane geometry.
7. Compute the binary representation of the floating point number $x = 0.1$ in single
precision IEEE arithmetic.

8. Experimentally determine the relative machine accuracy eps.

Hint. Write a computer program in your programming language of choice which
calculates the smallest machine number $z$ such that $1 + z > 1$.



Real-Valued  Functions 


The notion of a function is the mathematical way of formalising the idea that one
or more independent quantities are assigned to one or more dependent quantities.
Functions in general and their investigation are at the core of analysis. They help to
model dependencies of variable quantities, from simple planar graphs, curves and
surfaces in space to solutions of differential equations or the algorithmic construction
of fractals. On the one hand, this chapter serves to introduce the basic concepts. On
the other hand, the most important examples of real-valued, elementary functions
are discussed in an informal way. These include the power functions, the exponential
functions and their inverses. Trigonometric functions will be discussed in Chap. 3,
complex-valued functions in Chap. 4.


2.1  Basic  Notions 

The simplest case of a real-valued function is a double-row list of numbers, consisting
of values from an independent quantity $x$ and corresponding values of a dependent
quantity $y$.

Experiment 2.1 Study the mapping $y = x^2$ with the help of MATLAB. First choose
the region $D$ in which the $x$-values should vary, for instance $D = \{x \in \mathbb{R} : -1 \le
x \le 1\}$. The command

    x = -1:0.01:1;

produces a list of $x$-values, the row vector

$$x = [x_1, x_2, \dots, x_n] = [-1.00, -0.99, -0.98, \dots, 0.99, 1.00].$$

Using

    y = x.^2;

a row vector of the same length of corresponding $y$-values is generated. Finally
plot(x,y) plots the points $(x_1, y_1), \dots, (x_n, y_n)$ in the coordinate plane and
connects them with line segments. The result can be seen in Fig. 2.1.
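
For readers who work in Python, the same double-row list can be produced with plain lists. This is a sketch of one possible translation and not necessarily the program that accompanies the book; it deliberately avoids external packages, so the plotting step is only indicated in a comment.

```python
# x-values -1.00, -0.99, ..., 1.00 (201 points), built from an integer counter
# so that rounding errors do not accumulate
n = 201
x = [-1 + 0.01 * i for i in range(n)]

# corresponding y-values of the mapping y = x^2
y = [xi ** 2 for xi in x]

# the points (x_1, y_1), ..., (x_n, y_n) that a plot would connect by line segments
points = list(zip(x, y))
print(points[0], points[100], points[-1])

# With matplotlib installed, matplotlib.pyplot.plot(x, y) would draw the graph.
```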


In  the  general  mathematical  framework  we  do  not  just  want  to  assign  finite  lists 
of  values.  In  many  areas  of  mathematics  functions  defined  on  arbitrary  sets  are 
needed.  For  the  general  set-theoretic  notion  of  a  function  we  refer  to  the  literature, 
e.g.  [3,  Chap.  0.2].  This  section  is  dedicated  to  real-valued  functions,  which  are 
central  in  analysis. 


Definition 2.2 A real-valued function $f$ with domain $D$ and range $\mathbb{R}$ is a rule which
assigns to every $x \in D$ a real number $y \in \mathbb{R}$.


In general, $D$ is an arbitrary set. In this section, however, it will be a subset of $\mathbb{R}$.
For the expression function we also use the word mapping synonymously. A function
is denoted by

$$f : D \to \mathbb{R} : x \mapsto y = f(x).$$

The graph of the function $f$ is the set

$$\Gamma(f) = \{(x, y) \in D \times \mathbb{R} \,;\, y = f(x)\}.$$

Fig. 2.1 A function

In the case of $D \subset \mathbb{R}$ the graph can also be represented as a subset of the coordinate
plane. The set of the actually assumed values is called image of $f$ or proper range:

$$f(D) = \{f(x) \,;\, x \in D\}.$$


Example 2.3 A part of the graph of the quadratic function $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$
is shown in Fig. 2.2. If one chooses the domain to be $D = \mathbb{R}$, then the image is the
interval $f(D) = [0, \infty)$.


An important tool is the concept of inverse functions, whether to solve equations or to find new types of functions. Whether, and on which domain, a given function has an inverse depends on two main properties, injectivity and surjectivity, which we first investigate separately.

Definition 2.4 (a) A function $f : D \to \mathbb{R}$ is called injective or one-to-one, if different arguments always have different function values:

$x_1 \ne x_2 \implies f(x_1) \ne f(x_2).$

Fig. 2.2 Quadratic function

(b) A function $f : D \to B \subset \mathbb{R}$ is called surjective or onto from $D$ to $B$, if each $y \in B$ appears as a function value:

$\forall y \in B \ \ \exists x \in D : \ y = f(x).$

(c) A function $f : D \to B$ is called bijective, if it is injective and surjective.

Figures 2.3 and 2.4 illustrate these notions.

Surjectivity can always be enforced by reducing the range $B$; for example, $f : D \to f(D)$ is always surjective. Likewise, injectivity can be obtained by restricting the domain to a subdomain.
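For functions on finite sets these notions can be checked mechanically. The following Python lines are a minimal sketch; the helper names is_injective and is_surjective are hypothetical and chosen only for this illustration, and the domains are small finite samples rather than intervals.

```python
def is_injective(f, domain):
    """True if different arguments in `domain` have different values."""
    values = [f(x) for x in domain]
    return len(set(values)) == len(values)


def is_surjective(f, domain, target):
    """True if every element of `target` appears as a function value."""
    image = {f(x) for x in domain}
    return set(target) <= image


def square(x):
    return x * x


print(is_injective(square, range(-3, 4)))                # False: f(-1) = f(1)
print(is_injective(square, range(0, 4)))                 # True on the restricted domain
print(is_surjective(square, range(0, 4), range(10)))     # False: 2 is not a square
print(is_surjective(square, range(0, 4), {0, 1, 4, 9}))  # True: the target is f(D)
```

The last two calls illustrate the remark above: reducing the range to the image $f(D)$ enforces surjectivity, and restricting the domain yields injectivity.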

If $f : D \to B$ is bijective, then for every $y \in B$ there exists exactly one $x \in D$ with $y = f(x)$. The mapping $y \mapsto x$ then defines the inverse of the mapping $x \mapsto y$.

Definition 2.5 If the function

$f : D \to B : y = f(x)$

is bijective, then the assignment

$f^{-1} : B \to D : x = f^{-1}(y),$

which maps each $y \in B$ to the unique $x \in D$ with $y = f(x)$ is called the inverse function of the function $f$.

Fig. 2.3 Injectivity

Fig. 2.4 Surjectivity

Fig. 2.5 Bijectivity and inverse function

Example 2.6 The quadratic function $f(x) = x^2$ is bijective from $D = [0, \infty)$ to $B = [0, \infty)$. In these intervals ($x \ge 0$, $y \ge 0$) one has

$y = x^2 \iff x = \sqrt{y}.$

Here $\sqrt{y}$ denotes the positive square root. Thus the inverse of the quadratic function on the above intervals is given by $f^{-1}(y) = \sqrt{y}$; see Fig. 2.5.
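The role of the restricted domain in this example can be checked numerically. The following Python lines are a small sketch under the assumption that floating point comparisons up to rounding are acceptable; the names f and f_inv are chosen only for this illustration.

```python
import math


def f(x):
    return x * x


def f_inv(y):
    return math.sqrt(y)  # positive square root


# On D = [0, oo) the two mappings undo each other (up to rounding).
for x in [0.0, 0.5, 2.0, 3.0]:
    assert math.isclose(f_inv(f(x)), x)
for y in [0.0, 0.25, 2.0, 9.0]:
    assert math.isclose(f(f_inv(y)), y)

# Outside D the square root does not recover the argument.
print(f_inv(f(-2.0)))  # 2.0, not -2.0
```

The last line shows why $f$ has to be restricted to $[0, \infty)$ before it can be inverted.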

Once one has found the inverse function $f^{-1}$, it is usually written with variables $y = f^{-1}(x)$. This corresponds to flipping the graph of $y = f(x)$ about the diagonal $y = x$, as is shown in Fig. 2.6.

Experiment 2.7 The term inverse function is clearly illustrated by the MATLAB plot command. The graph of the inverse function can easily be plotted by interchanging the variables, which exactly corresponds to flipping the lists $y \leftrightarrow x$. For example, the graphs in Fig. 2.6 are obtained by

x = 0:0.01:1;
y = x.^2;
plot(x,y)
hold on
plot(y,x)

How the formatting, the dashed diagonal and the labelling are obtained can be learned from the M-file mat02_1.m.

Fig. 2.6 Inverse function and reflection in the diagonal


2.2  Some  Elementary  Functions 

The elementary functions are the powers and roots, exponential functions and logarithms, trigonometric functions and their inverse functions, as well as all functions
which  are  obtained  by  combining  these.  We  are  going  to  discuss  the  most  important 
basic  types  which  have  historically  proven  to  be  of  importance  for  applications.  The 
trigonometric  functions  will  be  dealt  with  in  Chap.  3. 


Linear functions (straight lines). A linear function $\mathbb{R} \to \mathbb{R}$ assigns each $x$-value a fixed multiple as $y$-value, i.e.,

$y = kx.$

Here

$k = \dfrac{\text{increase in height}}{\text{increase in length}} = \dfrac{\Delta y}{\Delta x}$

is the slope of the graph, which is a straight line through the origin. The connection between the slope and the angle between the straight line and the $x$-axis is discussed in Sect. 3.1. Adding an intercept $d \in \mathbb{R}$ translates the straight line $d$ units in $y$-direction (Fig. 2.7). The equation is then

$y = kx + d.$

Fig. 2.7 Equation of a straight line


Quadratic parabolas. The quadratic function with domain $D = \mathbb{R}$ in its basic form is given by

$y = x^2.$

Compression/stretching, horizontal and vertical translation are obtained via

$y = \alpha x^2, \quad y = (x - \beta)^2, \quad y = x^2 + \gamma.$

The effect of these transformations on the graph can be seen in Fig. 2.8.

$\alpha > 1$ ... compression in $x$-direction
$0 < \alpha < 1$ ... stretching in $x$-direction
$\alpha < 0$ ... reflection in the $x$-axis
$\beta > 0$ ... translation to the right
$\beta < 0$ ... translation to the left
$\gamma > 0$ ... translation upwards
$\gamma < 0$ ... translation downwards

The general quadratic function can be reduced to these cases by completing the square:

$y = ax^2 + bx + c = a\left(x + \dfrac{b}{2a}\right)^2 + c - \dfrac{b^2}{4a} = \alpha(x - \beta)^2 + \gamma.$
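Completing the square amounts to reading off $\alpha = a$, $\beta = -\frac{b}{2a}$ and $\gamma = c - \frac{b^2}{4a}$. The following Python lines are a minimal sketch of this computation; the function name complete_square is hypothetical, and the coefficients in the usage example are chosen arbitrarily.

```python
def complete_square(a, b, c):
    """Return (alpha, beta, gamma) with a*x^2 + b*x + c = alpha*(x - beta)^2 + gamma."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic function")
    alpha = a
    beta = -b / (2 * a)
    gamma = c - b * b / (4 * a)
    return alpha, beta, gamma


alpha, beta, gamma = complete_square(2, -8, 5)
print(alpha, beta, gamma)  # 2 2.0 -3.0

# Both forms agree at a few sample points.
for x in [-1.0, 0.0, 0.5, 3.0]:
    assert abs((2 * x**2 - 8 * x + 5) - (alpha * (x - beta) ** 2 + gamma)) < 1e-12
```

In the example, $2x^2 - 8x + 5 = 2(x - 2)^2 - 3$, so the basic parabola is compressed by the factor 2, translated 2 units to the right and 3 units downwards.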

Power functions. In the case of an integer exponent $n \in \mathbb{N}$ the following rules apply

$x^n = x \cdot x \cdot x \cdots x$ ($n$ factors), $\quad x^1 = x,$

$x^0 = 1, \quad x^{-n} = \dfrac{1}{x^n} \quad (x \ne 0).$

The behaviour of $y = x^3$ can be seen in the picture on the right-hand side of Fig. 2.3, the one of $y = x^4$ in the picture on the left-hand side of Fig. 2.4. The graphs of the other odd and even powers behave similarly to these two cases, respectively.

As an example of fractional exponents we consider the root functions $y = \sqrt[n]{x} = x^{1/n}$ for $n \in \mathbb{N}$ with domain $D = [0, \infty)$. Here $y = \sqrt[n]{x}$ is defined as the inverse function of the $n$th power, see Fig. 2.9 left. The graph of $y = x^{-1}$ with domain $D = \mathbb{R} \setminus \{0\}$ is pictured in Fig. 2.9 right.

Fig. 2.9 Power functions with fractional and negative exponents

Absolute value, sign and indicator function. The graph of the absolute value function

$y = |x| = \begin{cases} x, & x \ge 0, \\ -x, & x < 0 \end{cases}$

has a kink at the point $(0, 0)$, see Fig. 2.10 left.

The graph of the sign function or signum function

$y = \operatorname{sign} x = \begin{cases} 1, & x > 0, \\ 0, & x = 0, \\ -1, & x < 0 \end{cases}$

has a jump at $x = 0$ (Fig. 2.10 right). The indicator function of a subset $A \subset \mathbb{R}$ is defined as

$\mathbb{1}_A(x) = \begin{cases} 1, & x \in A, \\ 0, & x \notin A. \end{cases}$

Fig. 2.10 Absolute value and sign


Exponential functions and logarithms. Integer powers of a number $a > 0$ have just been defined. Fractional (rational) powers give

$a^{1/n} = \sqrt[n]{a}, \quad a^{m/n} = (\sqrt[n]{a})^m = \sqrt[n]{a^m}.$

If $r$ is an arbitrary real number then $a^r$ is defined by its approximations $a^{m/n}$, where $\frac{m}{n}$ is the rational approximation to $r$ obtained by decimal expansion.

Example 2.8 $2^\pi$ is defined by the sequence

$2^3, \ 2^{3.1}, \ 2^{3.14}, \ 2^{3.141}, \ 2^{3.1415}, \ \ldots$

where

$2^{3.1} = 2^{31/10} = \sqrt[10]{2^{31}}, \quad 2^{3.14} = 2^{314/100} = \sqrt[100]{2^{314}}, \quad \ldots$ etc.
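The approximation process of this example can be observed numerically. The following Python lines are a minimal sketch: they truncate the floating point value math.pi to a given number of decimal digits and rely on Python's built-in power operator for rational exponents, which is an assumption of this illustration and not the construction used in the text.

```python
import math

target = 2 ** math.pi  # floating point reference value

approximations = []
for digits in range(5):
    # Truncate pi to `digits` decimal places: 3, 3.1, 3.14, 3.141, 3.1415.
    r = math.floor(math.pi * 10**digits) / 10**digits
    approximations.append(2**r)
    print(r, 2**r, target - 2**r)
```

The printed differences decrease towards zero, which is the behaviour the definition of $2^\pi$ relies on.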

This somewhat informal introduction of the exponential function should be sufficient to have some examples at hand for applications in the following sections. With the tools we have developed so far we cannot yet show that this process of approximation actually leads to a well-defined mathematical object. The success of this process is based on the completeness of the real numbers. This will be thoroughly discussed in Chap. 5.

From the definition above we obtain that the following rules of calculation are valid for rational exponents:

$a^r a^s = a^{r+s}$
$(a^r)^s = a^{rs} = (a^s)^r$
$a^r b^r = (ab)^r$

for $a, b > 0$ and arbitrary $r, s \in \mathbb{Q}$. The fact that these rules are also true for real-valued exponents $r, s \in \mathbb{R}$ can be shown by employing a limiting argument.

The graph of the exponential function with base $a$, the function $y = a^x$, increases for $a > 1$ and decreases for $0 < a < 1$, see Fig. 2.11. Its proper range is $B = (0, \infty)$; for $a \ne 1$ the exponential function is bijective from $\mathbb{R}$ to $(0, \infty)$. Its inverse function is the logarithm to the base $a$ (with domain $(0, \infty)$ and range $\mathbb{R}$):

$y = a^x \iff x = \log_a y.$

For example, $\log_{10} 2$ is the power by which 10 needs to be raised to obtain 2:

$2 = 10^{\log_{10} 2}.$

Other examples are, for instance:

$2 = \log_{10}(10^2), \quad \log_{10} 10 = 1, \quad \log_{10} 1 = 0, \quad \log_{10} 0.001 = -3.$

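Euler’s number $e$ is defined by

$e = 1 + \dfrac{1}{1} + \dfrac{1}{2} + \dfrac{1}{6} + \dfrac{1}{24} + \cdots$

$\phantom{e} = 1 + \dfrac{1}{1!} + \dfrac{1}{2!} + \dfrac{1}{3!} + \dfrac{1}{4!} + \cdots = \sum_{j=0}^{\infty} \dfrac{1}{j!}$

$\phantom{e} \approx 2.718281828459045235360287471\ldots$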
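The partial sums of this series approach $e$ very quickly. The following Python lines are a minimal sketch; the function name euler_partial_sum is hypothetical, and the comparison value math.e is the floating point constant of the standard library.

```python
import math


def euler_partial_sum(n):
    """Sum of 1/j! for j = 0, ..., n."""
    total = 0.0
    term = 1.0  # 1/0!
    for j in range(n + 1):
        total += term
        term /= j + 1  # turns 1/j! into 1/(j+1)!
    return total


for n in [1, 2, 4, 8, 16]:
    print(n, euler_partial_sum(n), math.e - euler_partial_sum(n))
```

Each term is obtained from the previous one by a single division, so no factorial has to be computed explicitly.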


That this summation of infinitely many numbers can be defined rigorously will be proven in Chap. 5 by invoking the completeness of the real numbers. The logarithm to the base $e$ is called natural logarithm and is denoted by log:

$\log x = \log_e x.$

¹L. Euler, 1707-1783.

Fig. 2.12 Logarithms to the base $e$ and to the base 10


In some books the natural logarithm is denoted by $\ln x$. We stick to the notation $\log x$ which is used, e.g., in MATLAB. The following rules are obtained directly by rewriting the rules for the exponential function:

$u = e^{\log u}$
$\log(uv) = \log u + \log v$
$\log(u^z) = z \log u$

for $u, v > 0$ and arbitrary $z \in \mathbb{R}$. In addition, it holds that

$u = \log(e^u)$

for all $u \in \mathbb{R}$, and $\log e = 1$. In particular it follows from the above that

$\log \dfrac{1}{u} = -\log u, \quad \log \dfrac{v}{u} = \log v - \log u.$

The graphs of $y = \log x$ and $y = \log_{10} x$ are shown in Fig. 2.12.

Hyperbolic functions and their inverses. Hyperbolic functions and their inverses will mainly be needed in Chap. 14 for the parametric representation of hyperbolas, in Chap. 10 for evaluating integrals and in Chap. 19 for explicitly solving some differential equations.

The hyperbolic sine, the hyperbolic cosine and the hyperbolic tangent are defined by

$\sinh x = \dfrac{1}{2}\left(e^x - e^{-x}\right), \quad \cosh x = \dfrac{1}{2}\left(e^x + e^{-x}\right), \quad \tanh x = \dfrac{\sinh x}{\cosh x}$

for $x \in \mathbb{R}$. Their graphs are displayed in Fig. 2.13. An important property is the identity

$\cosh^2 x - \sinh^2 x = 1,$

which can easily be verified by inserting the defining expressions.

Fig. 2.13 Hyperbolic sine and cosine (left), and hyperbolic tangent (right)

Figure 2.13 shows that the hyperbolic sine is invertible as a function from $\mathbb{R} \to \mathbb{R}$, the hyperbolic cosine is invertible as a function from $[0, \infty) \to [1, \infty)$, and the hyperbolic tangent is invertible as a function from $\mathbb{R} \to (-1, 1)$. The inverse hyperbolic functions, also known as area functions, are referred to as inverse hyperbolic sine (cosine, tangent) or area hyperbolic sine (cosine, tangent). They can be expressed by means of logarithms as follows (see Exercise 15):

$\operatorname{arsinh} x = \log\left(x + \sqrt{x^2 + 1}\right)$ for $x \in \mathbb{R}$,
$\operatorname{arcosh} x = \log\left(x + \sqrt{x^2 - 1}\right)$ for $x \ge 1$,
$\operatorname{artanh} x = \dfrac{1}{2} \log \dfrac{1 + x}{1 - x}$ for $|x| < 1$.
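The logarithmic expressions for the area functions can be compared with the inverse hyperbolic functions of Python's standard library. The following lines are a minimal sketch; the function names arsinh, arcosh and artanh are defined here only for the comparison, and agreement is checked up to rounding errors.

```python
import math


def arsinh(x):
    return math.log(x + math.sqrt(x * x + 1))


def arcosh(x):  # requires x >= 1
    return math.log(x + math.sqrt(x * x - 1))


def artanh(x):  # requires |x| < 1
    return 0.5 * math.log((1 + x) / (1 - x))


for x in [-2.0, 0.0, 0.5, 3.0]:
    assert math.isclose(arsinh(x), math.asinh(x), rel_tol=1e-9, abs_tol=1e-12)
for x in [1.0, 1.5, 10.0]:
    assert math.isclose(arcosh(x), math.acosh(x), rel_tol=1e-9, abs_tol=1e-12)
for x in [-0.9, 0.0, 0.3, 0.9]:
    assert math.isclose(artanh(x), math.atanh(x), rel_tol=1e-9, abs_tol=1e-12)

# Round trip through the hyperbolic sine.
print(math.sinh(arsinh(1.25)))
```

For large negative arguments the expression $x + \sqrt{x^2 + 1}$ suffers from cancellation in floating point arithmetic, which is one reason why library routines do not evaluate the formula literally.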


2.3  Exercises 

1. How does the graph of an arbitrary function $y = f(x) : \mathbb{R} \to \mathbb{R}$ change under the transformations

$y = f(ax), \quad y = f(x - b), \quad y = c f(x), \quad y = f(x) + d$

with $a, b, c, d \in \mathbb{R}$? Distinguish the following different cases for $a$:

$a < -1, \quad -1 < a < 0, \quad 0 < a < 1, \quad a > 1,$

and for $b, c, d$ the cases

$b, c, d > 0, \quad b, c, d < 0.$

Sketch the resulting graphs.

2. Let the function $f : D \to \mathbb{R} : x \mapsto 3x^4 - 2x^3 - 3x^2 + 1$ be given. Using MATLAB plot the graphs of $f$ for

$D = [-1, 1.5], \quad D = [-0.5, 0.5], \quad D = [0.5, 1.5].$

Explain the behaviour of the function for $D = \mathbb{R}$ and find

$f([-1, 1.5]), \quad f((-0.5, 0.5)), \quad f((-\infty, 1]).$

3. Which of the following functions are injective/surjective/bijective?

$f : \mathbb{N} \to \mathbb{N} : n \mapsto n^2 - 6n + 10;$
$g : \mathbb{R} \to \mathbb{R} : x \mapsto |x + 1| - 3;$
$h : \mathbb{R} \to \mathbb{R} : x \mapsto x^3.$

Hint. Illustrative examples for the use of the MATLAB plot command may be found in the M-file mat02_2.m.

4. Sketch the graph of the function $y = x^2 - 4x$ and justify why it is bijective as a function from $D = (-\infty, 2]$ to $B = [-4, \infty)$. Compute its inverse function on the given domain.

5. Check that the following functions $D \to B$ are bijective in the given regions and compute the inverse function in each case:

$y = -2x + 3, \quad D = \mathbb{R}, \quad B = \mathbb{R};$

$y = x^2 + 1, \quad D = (-\infty, 0], \quad B = [1, \infty);$

$y = x^2 - 2x - 1, \quad D = [1, \infty), \quad B = [-2, \infty).$

6.  Find  the  equation  of  the  straight  line  through  the  points  (1,1)  and  (4,  3)  as  well 
as  the  equation  of  the  quadratic  parabola  through  the  points  (—1,  6),  (0,  5)  and 
(2,21). 

7.  Let  the  amount  of  a  radioactive  substance  at  time  t  =  0  be  A  grams.  According 
to the law of radioactive decay, there remain $A \cdot q^t$ grams after $t$ days. Compute
q  for  radioactive  iodine  131  from  its  half  life  (8  days)  and  work  out  after  how 
many  days  of  the  original  amount  of  iodine  131  is  remaining. 

Hint.  The  half  life  is  the  time  span  after  which  only  half  of  the  initial  amount  of 
radioactive  substance  is  remaining. 

8. Let $I$ [Watt/cm$^2$] be the sound intensity of a sound wave that hits a detector surface. According to the Weber-Fechner law, its sound level $L$ [Phon] is computed by

$L = 10 \log_{10}(I/I_0)$

where $I_0 = 10^{-16}$ W/cm$^2$. If the intensity $I$ of a loudspeaker produces a sound level of 80 Phon, which level is then produced by an intensity of $2I$ by two loudspeakers?

9. For $x \in \mathbb{R}$ the floor function $\lfloor x \rfloor$ denotes the largest integer not greater than $x$, i.e.,

$\lfloor x \rfloor = \max\{n \in \mathbb{Z} \,;\ n \le x\}.$

Plot the following functions with domain $D = [0, 10]$ using the MATLAB command floor:

$y = \lfloor x \rfloor, \quad y = x - \lfloor x \rfloor, \quad y = (x - \lfloor x \rfloor)^3, \quad y = (\lfloor x \rfloor)^3.$

Try to program correct plots in which the vertical connecting lines do not appear.

10. A function $f : D = \{1, 2, \ldots, N\} \to B = \{1, 2, \ldots, N\}$ is given by the list of its function values $y = (y_1, \ldots, y_N)$, $y_i = f(i)$. Write a MATLAB program which determines whether $f$ is bijective. Test your program by generating random $y$-values using

(a) y = unidrnd(N,1,N), (b) y = randperm(N).

Hint. See the two M-files mat02_ex12a.m and mat02_ex12b.m or the Python-file python02_ex12.

11. Draw the graph of the function $f : \mathbb{R} \to \mathbb{R} : y = ax + \operatorname{sign} x$ for different values of $a$. Distinguish between the cases $a > 0$, $a = 0$, $a < 0$. For which values of $a$ is the function $f$ injective and surjective, respectively?

12. Let $a > 0$, $b > 0$. Verify the laws of exponents

$a^r a^s = a^{r+s}, \quad (a^r)^s = a^{rs}, \quad a^r b^r = (ab)^r$

for rational $r = k/l$, $s = m/n$.

Hint. Start by verifying the laws for integer $r$ and $s$ (and arbitrary $a, b > 0$). To prove the first law for rational $r = k/l$, $s = m/n$, write

$\left(a^{k/l} a^{m/n}\right)^{ln} = \left(a^{k/l}\right)^{ln} \left(a^{m/n}\right)^{ln} = a^{kn} a^{lm} = a^{kn + lm}$

using the third law for integer exponents and inspection; conclude that

$a^{k/l} a^{m/n} = a^{(kn + lm)/ln} = a^{k/l + m/n}.$

13. Using the arithmetics of exponentiation, verify the rules $\log(uv) = \log u + \log v$ and $\log u^z = z \log u$ for $u, v > 0$ and $z \in \mathbb{R}$.

Hint. Set $x = \log u$, $y = \log v$, so $uv = e^x e^y$. Use the laws of exponents and take the logarithm.

14. Verify the identity $\cosh^2 x - \sinh^2 x = 1$.

15. Show that $\operatorname{arsinh} x = \log\left(x + \sqrt{x^2 + 1}\right)$ for $x \in \mathbb{R}$.

Hint. Set $y = \operatorname{arsinh} x$ and solve the identity $x = \sinh y = \frac{1}{2}(e^y - e^{-y})$ for $y$. Substitute $u = e^y$ to derive the quadratic equation $u^2 - 2xu - 1 = 0$ for $u$. Observe that $u > 0$ to select the appropriate root of this equation.


Trigonometry


Trigonometric  functions  play  a  major  role  in  geometric  considerations  as  well  as  in 
the  modelling  of  oscillations.  We  introduce  these  functions  at  the  right-angled  triangle 
and  extend  them  periodically  to  R  using  the  unit  circle.  Furthermore,  we  will  discuss 
the  inverse  functions  of  the  trigonometric  functions  in  this  chapter.  As  an  application 
we  will  consider  the  transformation  between  Cartesian  and  polar  coordinates. 


3.1  Trigonometric  Functions  at  the  Triangle 

The  definitions  of  the  trigonometric  functions  are  based  on  elementary  properties  of 
the  right-angled  triangle.  Figure  3.1  shows  a  right-angled  triangle.  The  sides  adjacent 
to  the  right  angle  are  called  legs  (or  catheti),  the  opposite  side  hypotenuse. 

One  of  the  basic  properties  of  the  right-angled  triangle  is  expressed  by  Pythagoras’ 
theorem. 

Proposition 3.1 (Pythagoras) In a right-angled triangle the sum of the squares of the legs equals the square of the hypotenuse. In the notation of Fig. 3.1 this says that $a^2 + b^2 = c^2$.

Proof According to Fig. 3.2 one can easily see that

$(a + b)^2 - c^2 = \text{area of the grey triangles} = 2ab.$

Since $(a + b)^2 = a^2 + 2ab + b^2$, it follows from this that $a^2 + b^2 - c^2 = 0$. □

Pythagoras, approx. 570-501 B.C.

Fig. 3.1 A right-angled triangle with legs $a$, $b$ and hypotenuse $c$

Fig. 3.2 Basic idea of the proof of Pythagoras’ theorem

A  fundamental  fact  is  Thales’  intercept  theorem  which  says  that  the  ratios  of 
the  sides  in  a  triangle  are  scale  invariant;  i.e.  they  do  not  depend  on  the  size  of  the 
triangle. 

In the situation of Fig. 3.3 Thales’ theorem asserts that the following ratios are valid:

$\dfrac{a}{c} = \dfrac{a'}{c'}, \quad \dfrac{b}{c} = \dfrac{b'}{c'}, \quad \dfrac{a}{b} = \dfrac{a'}{b'}.$

The reason for this is that by changing the scale (enlargement or reduction of the triangle) all sides are changed by the same factor. One then concludes that the ratios of the sides only depend on the angle $\alpha$ (and $\beta = 90^\circ - \alpha$, respectively). This gives rise to the following definition.

²Thales of Miletus, approx. 624-547 B.C.

Definition 3.2 (Trigonometric functions) For $0^\circ \le \alpha \le 90^\circ$ we define

$\sin\alpha = \dfrac{a}{c} = \dfrac{\text{opposite leg}}{\text{hypotenuse}}$ (sine),

$\cos\alpha = \dfrac{b}{c} = \dfrac{\text{adjacent leg}}{\text{hypotenuse}}$ (cosine),

$\tan\alpha = \dfrac{a}{b} = \dfrac{\text{opposite leg}}{\text{adjacent leg}}$ (tangent),

$\cot\alpha = \dfrac{b}{a} = \dfrac{\text{adjacent leg}}{\text{opposite leg}}$ (cotangent).

Note that $\tan\alpha$ is not defined for $\alpha = 90^\circ$ (since $b = 0$) and that $\cot\alpha$ is not defined for $\alpha = 0^\circ$ (since $a = 0$). The identities

$\tan\alpha = \dfrac{\sin\alpha}{\cos\alpha}, \quad \cot\alpha = \dfrac{\cos\alpha}{\sin\alpha}, \quad \sin\alpha = \cos\beta = \cos(90^\circ - \alpha)$

follow directly from the definition, the relationship

$\sin^2\alpha + \cos^2\alpha = 1$

is obtained using Pythagoras’ theorem.

The trigonometric functions have many applications in mathematics. As a first example we derive the formula for the area of a general triangle; see Fig. 3.4. The sides of a triangle are usually labelled in counterclockwise direction using lowercase Latin letters, and the angles opposite the sides are labelled using the corresponding Greek letters. Because $F = \frac{1}{2} c h$ and $h = b \sin\alpha$ the formula for the area of a triangle can be written as

$F = \dfrac{1}{2}\, bc \sin\alpha = \dfrac{1}{2}\, ac \sin\beta = \dfrac{1}{2}\, ab \sin\gamma.$

So the area equals half the product of two sides times the sine of the enclosed angle. The last equality in the above formula is valid for reasons of symmetry. Here $\gamma$ denotes the angle opposite to the side $c$, in other words $\gamma = 180^\circ - \alpha - \beta$.
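The area formula translates directly into a short computation. The following Python lines are a minimal sketch; the function name triangle_area is hypothetical, the angle is passed in degrees as in the text above, and math.radians is used because Python's sine expects its argument in radian measure.

```python
import math


def triangle_area(side1, side2, enclosed_angle_deg):
    """Half the product of two sides times the sine of the enclosed angle."""
    return 0.5 * side1 * side2 * math.sin(math.radians(enclosed_angle_deg))


# Right-angled triangle with legs 3 and 4.
print(triangle_area(3, 4, 90))

# Equilateral triangle with side length 2; the exact area is sqrt(3).
print(triangle_area(2, 2, 60), math.sqrt(3))
```

For the right-angled triangle the formula reduces to half the product of the legs, since the sine of the right angle equals 1.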

As a second example we compute the slope of a straight line. Figure 3.5 shows a straight line $y = kx + d$. Its slope $k$ is the change of the $y$-value per unit change in $x$. It is calculated from the triangle attached to the straight line in Fig. 3.5 as $k = \tan\alpha$.

Fig. 3.5 Straight line with slope $k$

Fig. 3.6 Relationship between degrees and radian measure


In order to have simple formulas such as

$\dfrac{d}{dx} \sin x = \cos x,$

one has to measure the angle in radian measure. The connection between degree and radian measure can be seen from the unit circle (i.e., the circle with centre 0 and radius 1); see Fig. 3.6.

The radian measure of the angle $\alpha$ (in degrees) is defined as the length $\ell$ of the corresponding arc of the unit circle with the sign of $\alpha$. The arc length $\ell$ on the unit circle has no physical unit. However, one speaks about radians (rad) to emphasise the difference to degrees.

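As is generally known the circumference of the unit circle is $2\pi$ with the constant

$\pi = 3.141592653589793\ldots \approx \dfrac{22}{7}.$

For the conversion between the two measures we use that $360^\circ$ corresponds to $2\pi$ in radian measure, for short $360^\circ \leftrightarrow 2\pi$ [rad], so

$\alpha^\circ \leftrightarrow \dfrac{\pi}{180}\,\alpha$ [rad] and $\ell$ [rad] $\leftrightarrow \left(\dfrac{180}{\pi}\,\ell\right)^\circ,$

respectively. For example, $90^\circ \leftrightarrow \frac{\pi}{2}$ and $-270^\circ \leftrightarrow -\frac{3\pi}{2}$. Henceforth we always measure angles in radians.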
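The two conversion rules can be written down in a few lines. The following Python sketch uses the hypothetical function names deg_to_rad and rad_to_deg; the standard library already offers math.radians and math.degrees for the same purpose, which serve here as a cross-check.

```python
import math


def deg_to_rad(alpha_deg):
    """Convert an angle from degrees to radian measure."""
    return math.pi / 180 * alpha_deg


def rad_to_deg(ell):
    """Convert an angle from radian measure to degrees."""
    return 180 / math.pi * ell


for alpha in [90, 180, -270, 360]:
    print(alpha, deg_to_rad(alpha), math.radians(alpha))

print(rad_to_deg(math.pi / 2))
```

The sign of the angle is preserved, in accordance with the definition of the radian measure as a signed arc length.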


3.2 Extension of the Trigonometric Functions to $\mathbb{R}$

For $0 \le \alpha \le \frac{\pi}{2}$ the values $\sin\alpha$, $\cos\alpha$, $\tan\alpha$ and $\cot\alpha$ have a simple interpretation on the unit circle; see Fig. 3.7. This representation follows from the fact that the hypotenuse of the defining triangle has length 1 on the unit circle.

One now extends the definition of the trigonometric functions for $0 \le \alpha \le 2\pi$ by continuation with the help of the unit circle. A general point $P$ on the unit circle, which is defined by the angle $\alpha$, is assigned the coordinates

$P = (\cos\alpha, \sin\alpha),$

see Fig. 3.8. For $0 \le \alpha \le \frac{\pi}{2}$ this is compatible with the earlier definition. For larger angles the sine and cosine functions are extended to the interval $[0, 2\pi]$ by this convention. For example, it follows from the above that

$\sin\alpha = -\sin(\alpha - \pi), \quad \cos\alpha = -\cos(\alpha - \pi)$

for $\pi \le \alpha \le \frac{3\pi}{2}$, see Fig. 3.8.

For arbitrary values $\alpha \in \mathbb{R}$ one finally defines $\sin\alpha$ and $\cos\alpha$ by periodic continuation with period $2\pi$. For this purpose one first writes $\alpha = x + 2k\pi$ with a unique $x \in [0, 2\pi)$ and $k \in \mathbb{Z}$. Then one sets

$\sin\alpha = \sin(x + 2k\pi) = \sin x, \quad \cos\alpha = \cos(x + 2k\pi) = \cos x.$
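The decomposition $\alpha = x + 2k\pi$ used for the periodic continuation can be computed with the floor function. The following Python lines are a minimal sketch; the function name reduce_angle is hypothetical, and in floating point arithmetic the computed $x$ may differ from the exact value by rounding errors.

```python
import math


def reduce_angle(alpha):
    """Return (x, k) with alpha = x + 2*k*pi, x in [0, 2*pi) up to rounding, k an integer."""
    k = math.floor(alpha / (2 * math.pi))
    x = alpha - 2 * k * math.pi
    return x, k


for alpha in [7.0, -1.0, 100.0]:
    x, k = reduce_angle(alpha)
    # The sine and cosine of alpha and of the reduced angle x agree.
    print(alpha, k, x, math.sin(alpha) - math.sin(x), math.cos(alpha) - math.cos(x))
```

The printed differences are zero up to rounding, in accordance with the definition by periodic continuation.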


Fig. 3.7 Definition of the trigonometric functions on the unit circle

Fig. 3.8 Extension of the trigonometric functions on the unit circle

Fig. 3.9 The graphs of the sine and cosine functions in the interval $[-2\pi, 2\pi]$


With the help of the formulas

$\tan\alpha = \dfrac{\sin\alpha}{\cos\alpha}, \quad \cot\alpha = \dfrac{\cos\alpha}{\sin\alpha}$

the tangent and cotangent functions are extended as well. Since the sine function equals zero for integer multiples of $\pi$, the cotangent is not defined for such arguments. Likewise the tangent is not defined for odd multiples of $\frac{\pi}{2}$.

The graphs of the functions $y = \sin x$, $y = \cos x$ are shown in Fig. 3.9. The domain of both functions is $D = \mathbb{R}$.

The graphs of the functions $y = \tan x$ and $y = \cot x$ are presented in Fig. 3.10. The domain $D$ for the tangent is, as explained above, given by $D = \{x \in \mathbb{R} \,;\ x \ne \frac{\pi}{2} + k\pi,\ k \in \mathbb{Z}\}$, the one for the cotangent is $D = \{x \in \mathbb{R} \,;\ x \ne k\pi,\ k \in \mathbb{Z}\}$.

Many relations are valid between the trigonometric functions. For example, the following addition theorems, which can be proven by elementary geometrical considerations, are valid; see Exercise 3. The maple commands expand and combine use such identities to simplify trigonometric expressions.

Proposition 3.3 (Addition theorems) For $x, y \in \mathbb{R}$ it holds that

$\sin(x + y) = \sin x \cos y + \cos x \sin y,$
$\cos(x + y) = \cos x \cos y - \sin x \sin y.$

Fig. 3.10 The graphs of the tangent (left) and cotangent (right) functions


3.3 Cyclometric Functions


The cyclometric functions are inverse to the trigonometric functions in the appropriate bijectivity regions.

Sine and arcsine. The sine function is bijective from the interval $[-\frac{\pi}{2}, \frac{\pi}{2}]$ to the range $[-1, 1]$; see Fig. 3.9. This part of the graph is called principal branch of the sine. Its inverse function (Fig. 3.11) is called arcsine (or sometimes inverse sine)

$\arcsin : [-1, 1] \to \left[-\dfrac{\pi}{2}, \dfrac{\pi}{2}\right].$

According to the definition of the inverse function it follows that

$\sin(\arcsin y) = y$ for all $y \in [-1, 1]$.

However, the converse formula is only valid for the principal branch; i.e.

$\arcsin(\sin x) = x$ is only valid for $-\dfrac{\pi}{2} \le x \le \dfrac{\pi}{2}$.

For example, $\arcsin(\sin 4) = -0.8584073\ldots \ne 4$.
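This value is easily reproduced. The following Python lines are a minimal sketch with the standard library functions math.asin and math.sin; the comparison with $\pi - 4$ rests on the observation that $\sin x = \sin(\pi - x)$ and that $\pi - 4$ lies in the interval of the principal branch.

```python
import math

value = math.asin(math.sin(4))
print(value)        # approximately -0.8584073
print(math.pi - 4)  # the same number up to rounding

# On the principal branch the converse formula does hold.
for x in [-1.5, -0.3, 0.0, 1.2]:
    assert math.isclose(math.asin(math.sin(x)), x, rel_tol=1e-9, abs_tol=1e-12)
```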

Cosine and arccosine. Likewise, the principal branch of the cosine is defined as restriction of the cosine to the interval $[0, \pi]$ with range $[-1, 1]$. The principal branch is bijective, and its inverse function (Fig. 3.12) is called arccosine (or sometimes inverse cosine)

$\arccos : [-1, 1] \to [0, \pi].$

Fig. 3.11 The principal branch of the sine (left); the arcsine function (right)

Fig. 3.12 The principal branch of the cosine (left); the arccosine function (right)


Tangent and arctangent. As can be seen in Fig. 3.10 the restriction of the tangent to the interval $(-\frac{\pi}{2}, \frac{\pi}{2})$ is bijective. Its inverse function is called arctangent (or inverse tangent)

$\arctan : \mathbb{R} \to \left(-\dfrac{\pi}{2}, \dfrac{\pi}{2}\right).$

To be precise this is again the principal branch of the inverse tangent (Fig. 3.13).

Fig. 3.13 The principal branch of the arctangent

Fig. 3.14 Plane polar coordinates


Application 3.4 (Polar coordinates in the plane) The polar coordinates $(r, \varphi)$ of a point $P = (x, y)$ in the plane are obtained by prescribing its distance $r$ from the origin and the angle $\varphi$ with the positive $x$-axis (in counterclockwise direction); see Fig. 3.14.

The connection between Cartesian and polar coordinates is therefore described by

$x = r \cos\varphi,$
$y = r \sin\varphi,$

where $0 \le \varphi < 2\pi$ and $r \ge 0$. The range $-\pi < \varphi \le \pi$ is also often used.

In the converse direction the following conversion formulas are valid

$r = \sqrt{x^2 + y^2},$

$\varphi = \arctan\dfrac{y}{x}$ (in the region $x > 0$; $-\frac{\pi}{2} < \varphi < \frac{\pi}{2}$),

$\varphi = \operatorname{sign} y \cdot \arccos\dfrac{x}{\sqrt{x^2 + y^2}}$ (if $y \ne 0$ or $x > 0$; $-\pi < \varphi < \pi$).

The reader is encouraged to verify these formulas with the help of maple.
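The conversion formulas can also be checked numerically. The following Python lines are a minimal sketch; the function names sign and to_polar are hypothetical, and the treatment of the negative $x$-axis (angle $\pi$) is an assumption added for this illustration, since that case is excluded from the arccosine formula above. The library function math.atan2 serves as an independent cross-check.

```python
import math


def sign(y):
    return (y > 0) - (y < 0)


def to_polar(x, y):
    """Polar coordinates (r, phi) with -pi < phi <= pi; the origin is excluded."""
    r = math.hypot(x, y)
    if r == 0:
        raise ValueError("the angle is not defined at the origin")
    if y == 0 and x < 0:
        return r, math.pi  # assumption: negative x-axis, not covered by the formula
    # Clamp the quotient to [-1, 1] to guard against rounding errors.
    phi = sign(y) * math.acos(max(-1.0, min(1.0, x / r)))
    return r, phi


for (x, y) in [(1, 1), (-1, 1), (-1, -1), (2, 0), (0, -3)]:
    r, phi = to_polar(x, y)
    # Cross-check with atan2 and with x = r cos(phi), y = r sin(phi).
    print((x, y), r, phi, math.atan2(y, x), r * math.cos(phi), r * math.sin(phi))
```

In contrast to the formula with the arctangent, the arccosine formula distinguishes all four quadrants, which is also what math.atan2 does.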


3.4  Exercises 

1. Using geometric considerations at suitable right-angled triangles, determine the values of the sine, cosine and tangent of the angles $\alpha = 45^\circ$, $\beta = 60^\circ$, $\gamma = 30^\circ$. Extend your result for $\alpha = 45^\circ$ to the angles $135^\circ$, $225^\circ$, $-45^\circ$ with the help of the unit circle. What are the values of the angles under consideration in radian measure?

2. Using MATLAB write a function degrad.m which converts degrees to radian measure. The command degrad(180) should give $\pi$ as a result. Furthermore, write a function mysin.m which calculates the sine of an angle in radian measure with the help of degrad.m.

Fig. 3.15 Proof of Proposition 3.3


3. Prove the addition theorem of the sine function

$\sin(x + y) = \sin x \cos y + \cos x \sin y.$

Hint. If the angles $x$, $y$ and their sum $x + y$ are between 0 and $\pi/2$ you can directly argue with the help of Fig. 3.15; the remaining cases can be reduced to this case.

4. Prove the law of cosines

$a^2 = b^2 + c^2 - 2bc \cos\alpha$

for the general triangle in Fig. 3.4.

Hint. The segment $c$ is divided into two segments $c_1$ (left) and $c_2$ (right) by the height $h$. The following identities hold true by Pythagoras’ theorem

$a^2 = h^2 + c_2^2, \quad b^2 = h^2 + c_1^2, \quad c = c_1 + c_2.$

Eliminating $h$ gives $a^2 = b^2 + c^2 - 2cc_1$.

5. Compute the angles $\alpha$, $\beta$, $\gamma$ of the triangle with sides $a = 3$, $b = 4$, $c = 2$ and plot the triangle in maple.

Hint.  Use  the  law  of  cosines  from  Exercise  4. 

6. Prove the law of sines

$\dfrac{a}{\sin\alpha} = \dfrac{b}{\sin\beta} = \dfrac{c}{\sin\gamma}$

for the general triangle in Fig. 3.4.

Hint. The first identity follows from

$\sin\alpha = \dfrac{h}{b}, \quad \sin\beta = \dfrac{h}{a}.$

Fig. 3.16 Right circular truncated cone with unrolled surface


7. Compute the missing sides and angles of the triangle with data $b = 5$, $\alpha = 43^\circ$, $\gamma = 62^\circ$, and plot your solutions using MATLAB.

Hint.  Use  the  law  of  sines  from  Exercise  6. 

8. With the help of MATLAB plot the following functions

$y = \cos(\arccos x), \quad x \in [-1, 1];$

$y = \arccos(\cos x), \quad x \in [0, \pi];$

$y = \arccos(\cos x), \quad x \in [0, 4\pi].$

Why is $\arccos(\cos x) \ne x$ in the last case?

9. Plot the functions $y = \sin x$, $y = |\sin x|$, $y = \sin^2 x$, $y = \sin^3 x$, $y = \frac{1}{2}(|\sin x| - \sin x)$ and $y = \arcsin\left(\frac{1}{2}(|\sin x| - \sin x)\right)$ in the interval $[0, 6\pi]$. Explain your results.

Hint.  Use  the  MATLAB  command  axis  equal. 

10. Plot the graph of the function $f : \mathbb{R} \to \mathbb{R} : x \mapsto ax + \sin x$ for various values of $a$. For which values of $a$ is the function $f$ injective or surjective?

11.  Show  that  the  following  formulas  for  the  surface  line  s  and  the  surface  area  M 
of  a  right  circular  truncated  cone  (see  Fig.  3.16,  left)  hold  true 

\[ s = \sqrt{h^2 + (R - r)^2}, \quad M = \pi (r + R) s. \]

Hint. By unrolling the truncated cone a sector of an annulus with apex angle
$\alpha$ is created; see Fig. 3.16, right. Therefore, the following relationships hold:
$\alpha t = 2\pi r$, $\alpha(s + t) = 2\pi R$ and $M = \frac{1}{2}\alpha\big((s + t)^2 - t^2\big)$.

12.  The  secant  and  cosecant  functions  are  defined  as  the  reciprocals  of  the  cosine 
and  the  sine  functions,  respectively, 

\[ \sec\alpha = \frac{1}{\cos\alpha}, \quad \csc\alpha = \frac{1}{\sin\alpha}. \]

Due to the zeros of the cosine and the sine function, the secant is not defined for
odd multiples of $\frac{\pi}{2}$, and the cosecant is not defined for integer multiples of $\pi$.

(a) Prove the identities $1 + \tan^2\alpha = \sec^2\alpha$ and $1 + \cot^2\alpha = \csc^2\alpha$.

(b) With the help of MATLAB plot the graph of the functions $y = \sec x$ and
$y = \csc x$ for $x$ between $-2\pi$ and $2\pi$.


4 Complex Numbers


Complex  numbers  are  not  just  useful  when  solving  polynomial  equations  but  play 
an important role in many fields of mathematical analysis. With the help of complex
functions, transformations of the plane can be expressed, solution formulas for differential
equations can be obtained, and matrices can be classified. Not least, fractals
can  be  defined  by  complex  iteration  processes.  In  this  section  we  introduce  complex 
numbers  and  then  discuss  some  elementary  complex  functions,  like  the  complex 
exponential  function.  Applications  can  be  found  in  Chaps.  9  (fractals),  20  (systems 
of  differential  equations)  and  in  Appendix  B  (normal  form  of  matrices). 


4.1  The  Notion  of  Complex  Numbers 

The set of complex numbers $\mathbb{C}$ represents an extension of the real numbers, in which
the polynomial $z^2 + 1$ has a root. Complex numbers can be introduced as pairs $(a, b)$
of real numbers for which addition and multiplication are defined as follows:

\[ (a, b) + (c, d) = (a + c,\ b + d), \]

\[ (a, b) \cdot (c, d) = (ac - bd,\ ad + bc). \]

The real numbers are considered as the subset of all pairs of the form $(a, 0)$, $a \in \mathbb{R}$.
Squaring the pair $(0, 1)$ shows that

\[ (0, 1) \cdot (0, 1) = (-1, 0). \]

The square of $(0, 1)$ thus corresponds to the real number $-1$. Therefore, $(0, 1)$ provides
a root for the polynomial $z^2 + 1$. This root is denoted by $i$; in other words

\[ i^2 = -1. \]


Using this notation and rewriting the pairs $(a, b)$ in the form $a + ib$, one obtains a
computationally more convenient representation of the set of complex numbers:

\[ \mathbb{C} = \{a + ib \,;\ a \in \mathbb{R},\ b \in \mathbb{R}\}. \]

The rules of calculation with pairs $(a, b)$ then simply amount to the usual calculations
with the expressions $a + ib$, treated like ordinary terms with the additional rule that
$i^2 = -1$:

\[ (a + ib) + (c + id) = a + c + i(b + d), \]

\[ (a + ib)(c + id) = ac + ibc + iad + i^2 bd = ac - bd + i(ad + bc). \]

So, for example,

\[ (2 + 3i)(-1 + i) = -5 - i. \]
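
The pair definition can be checked numerically. The following Python sketch is an illustration only; the helper names `add_pairs` and `mul_pairs` are hypothetical and not part of the book's programs. It applies the two rules for pairs stated above and compares the result with Python's built-in complex type.

```python
def add_pairs(p, q):
    """Add two complex numbers given as pairs (a, b) and (c, d)."""
    a, b = p
    c, d = q
    return (a + c, b + d)


def mul_pairs(p, q):
    """Multiply two pairs by the rule (a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)


i_squared = mul_pairs((0, 1), (0, 1))     # the pair (0, 1) plays the role of i
product = mul_pairs((2, 3), (-1, 1))      # (2 + 3i)(-1 + i)
builtin_product = (2 + 3j) * (-1 + 1j)    # same product with Python's complex type

print(i_squared, product, builtin_product)
```

Python writes the imaginary unit as `j`, so `2 + 3j` stands for $2 + 3i$.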


Definition 4.1 For the complex number $z = x + iy$,

\[ x = \operatorname{Re} z, \quad y = \operatorname{Im} z \]

denote the real part and the imaginary part of $z$, respectively. The real number

\[ |z| = \sqrt{x^2 + y^2} \]

is the absolute value (or modulus) of $z$, and

\[ \bar{z} = x - iy \]

is the complex conjugate of $z$.


A simple calculation shows that

\[ z\bar{z} = (x + iy)(x - iy) = x^2 + y^2 = |z|^2, \]

which means that $z\bar{z}$ is always a real number. From this we obtain the rule for
calculating with fractions

\[ \frac{u + iv}{x + iy} = \left(\frac{u + iv}{x + iy}\right)\left(\frac{x - iy}{x - iy}\right) = \frac{(u + iv)(x - iy)}{x^2 + y^2} = \frac{ux + vy}{x^2 + y^2} + i\,\frac{vx - uy}{x^2 + y^2}. \]

It is achieved by expansion with the complex conjugate of the denominator. Apparently
one can therefore divide by any complex number not equal to zero, and the set
$\mathbb{C}$ forms a field.


Experiment 4.2 Type in MATLAB: z = complex(2,3) (equivalently z = 2+3*i
or z = 2+3*j) as well as w = complex(-1,1) and try out the commands z*w,
z/w as well as real(z), imag(z), conj(z), abs(z).
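
For readers working in Python rather than MATLAB, a rough counterpart of this experiment might look as follows. It uses only Python's built-in complex type; the variable names are chosen here for illustration. The last statement recomputes the quotient by expanding with the complex conjugate of the denominator, as in the rule for fractions above.

```python
z = complex(2, 3)      # equivalently 2 + 3j
w = complex(-1, 1)

product = z * w
quotient = z / w

real_part = z.real
imag_part = z.imag
conjugate = z.conjugate()
modulus = abs(z)

# Division by expansion with the conjugate of the denominator:
# z / w = z * conj(w) / |w|^2
quotient_by_rule = z * w.conjugate() / (w.real ** 2 + w.imag ** 2)

print(product, quotient, real_part, imag_part, conjugate, modulus)
```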

Clearly every negative real $x$ has two square roots in $\mathbb{C}$, namely $i\sqrt{|x|}$ and $-i\sqrt{|x|}$.
More than that, the fundamental theorem of algebra says that $\mathbb{C}$ is algebraically
closed. Thus every polynomial equation

\[ \alpha_n z^n + \alpha_{n-1} z^{n-1} + \cdots + \alpha_1 z + \alpha_0 = 0 \]

with coefficients $\alpha_j \in \mathbb{C}$, $\alpha_n \ne 0$ has $n$ complex solutions (counted with their
multiplicity).

Example 4.3 (Taking the square root of complex numbers) The equation $z^2 = a + ib$
can be solved by the ansatz

\[ (x + iy)^2 = a + ib \]

so

\[ x^2 - y^2 = a, \quad 2xy = b. \]

If one uses the second equation to express $y$ through $x$ and substitutes this into the
first equation, one obtains the quartic equation

\[ x^4 - ax^2 - b^2/4 = 0. \]

Solving this by substitution $t = x^2$ one obtains the two real solutions. In the case of
$b = 0$, either $x$ or $y$ equals zero depending on the sign of $a$.
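
The procedure of Example 4.3 can be turned into a few lines of code. The following Python sketch is one possible reading of it, under the assumption that one picks the non-negative root $t = \frac{1}{2}\big(a + \sqrt{a^2 + b^2}\big)$ of $t^2 - at - b^2/4 = 0$; the function name `complex_sqrt` is hypothetical.

```python
import math


def complex_sqrt(a, b):
    """Return both solutions z = x + iy of z**2 = a + ib."""
    if b == 0:
        # Either y or x vanishes, depending on the sign of a.
        if a >= 0:
            x, y = math.sqrt(a), 0.0
        else:
            x, y = 0.0, math.sqrt(-a)
    else:
        # Non-negative root of t**2 - a*t - b**2/4 = 0 with t = x**2.
        t = (a + math.hypot(a, b)) / 2
        x = math.sqrt(t)
        y = b / (2 * x)          # from 2xy = b
    z = complex(x, y)
    return z, -z


root_plus, root_minus = complex_sqrt(0, 2)   # solves z**2 = 2i
print(root_plus, root_minus)
```

For $b \ne 0$ the chosen $t$ is strictly positive, so the division by $2x$ is safe.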

The complex plane. A geometric representation of the complex numbers is obtained
by identifying $z = x + iy \in \mathbb{C}$ with the point $(x, y) \in \mathbb{R}^2$ in the coordinate plane
(Fig. 4.1). Geometrically $|z| = \sqrt{x^2 + y^2}$ is the distance of the point $(x, y)$ from the
origin; the complex conjugate $\bar{z} = x - iy$ is obtained by reflection in the $x$-axis.

The polar representation of a complex number $z = x + iy$ is obtained as in
Application 3.4 by

\[ r = |z|, \quad \varphi = \arg z. \]


Fig. 4.1 Complex plane
The angle $\varphi$ to the positive $x$-axis is called the argument of the complex number, whereupon
the choice of the interval $-\pi < \varphi \le \pi$ defines the principal value $\operatorname{Arg} z$ of the
argument. Thus

\[ z = x + iy = r(\cos\varphi + i \sin\varphi). \]

The multiplication of two complex numbers $z = r(\cos\varphi + i \sin\varphi)$, $w = s(\cos\psi + i \sin\psi)$
in polar representation corresponds to the product of the absolute values and
the sum of the angles:

\[ zw = rs\big(\cos(\varphi + \psi) + i \sin(\varphi + \psi)\big), \]

which follows from the addition formulas for sine and cosine:

\[ \sin(\varphi + \psi) = \sin\varphi \cos\psi + \cos\varphi \sin\psi, \]
\[ \cos(\varphi + \psi) = \cos\varphi \cos\psi - \sin\varphi \sin\psi, \]

see Proposition 3.3.
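
As a small numerical illustration of this multiplication rule, the following Python sketch converts two sample numbers to polar form with the standard `cmath` module, multiplies the absolute values, adds the angles and converts back. The sample values $1 + i$ and $2i$ are chosen here for illustration only.

```python
import cmath

z = 1 + 1j
w = 2j

r, phi = cmath.polar(z)    # r = |z|, phi = principal argument of z
s, psi = cmath.polar(w)    # s = |w|, psi = principal argument of w

# Product of the absolute values, sum of the angles
product_polar = cmath.rect(r * s, phi + psi)
product_direct = z * w

print(product_polar, product_direct)
```

Both results agree up to rounding errors.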


4.2  The  Complex  Exponential  Function 

An important tool for the representation of complex numbers and functions, but also
for the real trigonometric functions, is given by the complex exponential function.
For $z = x + iy$ this function is defined by

\[ e^z = e^x(\cos y + i \sin y). \]

The complex exponential function maps $\mathbb{C}$ to $\mathbb{C} \setminus \{0\}$. We will study its mapping
behaviour below. It is an extension of the real exponential function; i.e. if $z = x \in \mathbb{R}$,
then $e^z = e^x$. This is in accordance with the previously defined real-valued exponential
function. We also use the notation $\exp(z)$ for $e^z$.

The addition theorems for sine and cosine imply the usual rules of calculation

\[ e^{z+w} = e^z e^w, \quad e^0 = 1, \quad (e^z)^n = e^{nz}, \]

valid for $z, w \in \mathbb{C}$ and $n \in \mathbb{Z}$. In contrast to the case when $z$ is a real number, the last
rule (for raising to powers) is generally not true if $n$ is not an integer.

Exponential function and polar coordinates. According to the definition the exponential
function of a purely imaginary number $i\varphi$ equals

\[ e^{i\varphi} = \cos\varphi + i \sin\varphi, \]
\[ |e^{i\varphi}| = \sqrt{\cos^2\varphi + \sin^2\varphi} = 1. \]
Fig. 4.2 The unit circle in the complex plane


Thus the complex numbers

\[ \{e^{i\varphi} \,;\ -\pi < \varphi \le \pi\} \]

lie on the unit circle (Fig. 4.2).

For example, the following identities hold:

\[ e^{i\pi/2} = i, \quad e^{i\pi} = -1, \quad e^{2i\pi} = 1, \quad e^{2ki\pi} = 1 \ (k \in \mathbb{Z}). \]

Using $r = |z|$, $\varphi = \operatorname{Arg} z$ results in the especially simple form of the polar representation

\[ z = r e^{i\varphi}. \]


Taking  roots  is  accordingly  simple. 

Example 4.4 (Taking square roots in complex polar coordinates) If $z^2 = r e^{i\varphi}$, then
one obtains the two solutions $\pm\sqrt{r}\, e^{i\varphi/2}$ for $z$. For example, the problem

\[ z^2 = 2i = 2 e^{i\pi/2} \]

has the two solutions

\[ z = \sqrt{2}\, e^{i\pi/4} = 1 + i \]

and

\[ z = -\sqrt{2}\, e^{i\pi/4} = -1 - i. \]
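
Example 4.4 translates directly into code. The following Python sketch is an illustration using the standard `cmath` module; the function name `square_roots_polar` is hypothetical. It halves the principal argument and takes the real square root of the absolute value.

```python
import cmath
import math


def square_roots_polar(z):
    """Return the two solutions of w**2 = z, computed in polar form."""
    r, phi = cmath.polar(z)                      # z = r * exp(i*phi)
    root = cmath.rect(math.sqrt(r), phi / 2)     # sqrt(r) * exp(i*phi/2)
    return root, -root


z1, z2 = square_roots_polar(2j)
print(z1, z2)
```

Up to rounding errors the two values are $1 + i$ and $-1 - i$, as in the example.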

Euler’s formulas. By addition and subtraction, respectively, of the relations

\[ e^{i\varphi} = \cos\varphi + i \sin\varphi, \]
\[ e^{-i\varphi} = \cos\varphi - i \sin\varphi, \]

one obtains at once Euler’s formulas

\[ \cos\varphi = \frac{1}{2}\big(e^{i\varphi} + e^{-i\varphi}\big), \]
\[ \sin\varphi = \frac{1}{2i}\big(e^{i\varphi} - e^{-i\varphi}\big). \]

They  permit  a  representation  of  the  real  trigonometric  functions  by  means  of  the 
complex  exponential  function. 


4.3  Mapping  Properties  of  Complex  Functions 

In  this  section  we  study  the  mapping  properties  of  complex  functions.  More  precisely, 
we  ask  how  their  effect  can  be  described  geometrically.  Let 

\[ f : D \subset \mathbb{C} \to \mathbb{C} : z \mapsto w = f(z) \]

be a complex function, defined on a subset $D$ of the complex plane. The effect of the
function $f$ can best be visualised by plotting two complex planes next to each other,
the $z$-plane and the $w$-plane, and studying the images of rays and circles under $f$.

Example 4.5 The complex quadratic function maps $D = \mathbb{C}$ to $\mathbb{C}$ : $w = z^2$. Using
polar coordinates one obtains

\[ z = x + iy = r e^{i\varphi} \quad \Rightarrow \quad w = u + iv = r^2 e^{2i\varphi}. \]

From this representation it can be seen that the complex quadratic function maps a
circle of radius $r$ in the $z$-plane onto a circle of radius $r^2$ in the $w$-plane. Further, it
maps half-rays

\[ \{z = r e^{i\psi} : r > 0\} \]

with the angle of inclination $\psi$ onto half-rays with angle of inclination $2\psi$ (Fig. 4.3).

Particularly important are the mapping properties of the complex exponential
function $w = e^z$ because they form the basis for the definition of the complex logarithm
and the root functions. If $z = x + iy$ then $e^z = e^x(\cos y + i \sin y)$. We always
have that $e^x > 0$; furthermore $\cos y + i \sin y$ defines a point on the complex unit
circle which is unique for $-\pi < y \le \pi$. If $x$ moves along the real line then the
points $e^x(\cos y + i \sin y)$ form a half-ray with angle $y$, as can be seen in Fig. 4.4.
Conversely, if $x$ is fixed and $y$ varies between $-\pi$ and $\pi$ one obtains the circle with
radius $e^x$ in the $w$-plane. For example, the dotted circle (Fig. 4.4, right) is the image
of the dotted straight line (Fig. 4.4, left) under the exponential function.
Fig. 4.3 The complex quadratic function


Fig.  4.4  The  complex  exponential  function 


From what has just been said it follows that the exponential function is bijective
on the domain

\[ D = \{z = x + iy \,;\ x \in \mathbb{R},\ -\pi < y \le \pi\} \to B = \mathbb{C} \setminus \{0\}. \]

It thus maps the strip of width $2\pi$ onto the complex plane without zero. The argument
of $e^z$ exhibits a jump along the negative $u$-axis as indicated in Fig. 4.4 (right). Within
the domain $D$ the exponential function has an inverse function, the principal branch
of the complex logarithm. From the representation $w = e^z = e^x e^{iy}$ one derives at
once the relation $x = \log|w|$, $y = \operatorname{Arg} w$. Thus the principal value of the complex
logarithm of the complex number $w$ is given by

\[ z = \operatorname{Log} w = \log|w| + i \operatorname{Arg} w \]

and in polar coordinates

\[ \operatorname{Log}(r e^{i\varphi}) = \log r + i\varphi, \quad -\pi < \varphi \le \pi, \]

respectively.

With the help of the principal value of the complex logarithm, the principal values
of the $n$th complex root function can be defined by $\sqrt[n]{z} = \exp\big(\frac{1}{n} \operatorname{Log}(z)\big)$.
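
The two definitions above can be tried out numerically. The following Python sketch is an illustration only; the function names `principal_log` and `principal_root` are hypothetical. It assumes that `cmath.phase` from the standard library returns the principal argument, which holds for the inputs used here (for numbers on the negative real axis with a negative zero imaginary part, `cmath.phase` returns $-\pi$ instead of $\pi$).

```python
import cmath
import math


def principal_log(w):
    """Log w = log|w| + i Arg w for w != 0."""
    return complex(math.log(abs(w)), cmath.phase(w))


def principal_root(z, n):
    """Principal value of the n-th root: exp(Log(z) / n)."""
    return cmath.exp(principal_log(z) / n)


log_minus_one = principal_log(-1)      # expected: i*pi
cube_root = principal_root(-8, 3)      # expected: 2*exp(i*pi/3) = 1 + i*sqrt(3)

print(log_minus_one, cube_root)
```

Note that the principal cube root of $-8$ is not the real number $-2$, since the argument $\pi$ of $-8$ is divided by 3.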
Experiment 4.6 Open the applet 2D visualisation of complex functions and investigate
how the power functions $w = z^n$, $n \in \mathbb{N}$, map circles and rays of the complex
plane. Set the pattern polar coordinates and experiment with different sectors (intervals
of the argument $[\alpha, \beta]$ with $0 \le \alpha < \beta \le 2\pi$).

Experiment 4.7 Open the applet 2D visualisation of complex functions and investigate
how the exponential function $w = e^z$ maps horizontal and vertical straight lines
of the complex plane. Set the pattern grid and experiment with different strips, for
example $1 \le \operatorname{Re} z \le 2$, $-2 \le \operatorname{Im} z \le 2$.


4.4  Exercises 

1. Compute $\operatorname{Re} z$, $\operatorname{Im} z$, $\bar{z}$ and $|z|$ for each of the following complex numbers $z$:

\[ z = 3 + 2i, \quad z = -i, \quad z = \frac{1 + i}{2 - i}, \quad z = 3 - i + \frac{1}{3 - i}, \quad z = \frac{1 - 2i}{4 - 3i}. \]

Perform  these  calculations  in  MATLAB  as  well. 

2. Rewrite the following complex numbers in the form $z = r e^{i\varphi}$ and sketch them
in the complex plane:

\[ z = -1 - i, \quad z = -5, \quad z = 3i, \quad z = 2 - 2i, \quad z = 1 - i\sqrt{3}. \]

What are the values of $\varphi$ in radian measure?

3.  Compute  the  two  complex  solutions  of  the  equation 

\[ z^2 = 2 + 2i \]

with the help of the ansatz $z = x + iy$ and equating the real and the imaginary
parts. Test and explain the MATLAB commands

roots([2,0,-2-2*i])
sqrt(2+2*i)

4.  Compute  the  two  complex  solutions  of  the  equation 

\[ z^2 = 2 + 2i \]

in the form $z = r e^{i\varphi}$ from the polar representation of $2 + 2i$.

5. Compute the four complex solutions of the quartic equation

\[ z^4 - 2z^2 + 2 = 0 \]


by  hand  and  with  MATLAB  (command  roots). 
6. Let $z = x + iy$, $w = u + iv$. Check the formula $e^{z+w} = e^z e^w$ by using the definition
and applying the addition theorems for the trigonometric functions.

7.  Compute  z  =  Log  w  for 


\[ w = 1 + i, \quad w = -5i, \quad w = -1. \]


Sketch  w  and  z  in  the  complex  plane  and  verify  your  results  with  the  help  of  the 
relation  w  =  ez  and  with  MATLAB  (command  log). 

8.  The  complex  sine  and  cosine  functions  are  defined  by 


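\[ \sin z = \frac{1}{2i}\big(e^{iz} - e^{-iz}\big), \quad \cos z = \frac{1}{2}\big(e^{iz} + e^{-iz}\big) \]

for $z \in \mathbb{C}$.

(a) Show that both functions are periodic with period $2\pi$, that is $\sin(z + 2\pi) = \sin z$,
$\cos(z + 2\pi) = \cos z$.

(b) Verify that, for $z = x + iy$,

\[ \sin z = \sin x \cosh y + i \cos x \sinh y, \quad \cos z = \cos x \cosh y - i \sin x \sinh y. \]

(c) Show that $\sin z = 0$ if and only if $z = k\pi$, $k \in \mathbb{Z}$, and $\cos z = 0$ if and only
if $z = (k + \frac{1}{2})\pi$, $k \in \mathbb{Z}$.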


5 Sequences and Series


The  concept  of  a  limiting  process  at  infinity  is  one  of  the  central  ideas  of  mathematical  analysis. 
It  forms  the  basis  for  all  its  essential  concepts,  like  continuity,  differentiability,  series  expansions 
of  functions,  integration,  etc.  The  transition  from  a  discrete  to  a  continuous  setting  constitutes  the 
modelling  strength  of  mathematical  analysis.  Discrete  models  of  physical,  technical  or  economic 
processes can often be better and more easily understood, provided that the number of their atoms
(their discrete building blocks) is sufficiently big, if they are approximated by a continuous model
with the help of a limiting process. The transition from difference equations for biological growth
processes in discrete time to differential equations in continuous time is an example of that, as is
the description of share prices by stochastic processes in continuous time. The majority of models
in physics are field models, that is, they are expressed in a continuous space and time structure.
Even  though  the  models  are  discretised  again  in  numerical  approximations,  the  continuous  model 
is  still  helpful  as  a  background,  for  example  for  the  derivation  of  error  estimates. 

The  following  sections  are  dedicated  to  the  specification  of  the  idea  of  limiting  processes.  This 
chapter  starts  by  studying  infinite  sequences  and  series,  gives  some  applications  and  covers  the 
corresponding  notion  of  a  limit.  One  of  the  achievements  which  we  especially  emphasise  is  the 
completeness  of  the  real  numbers.  It  guarantees  the  existence  of  limits  for  arbitrary  monotonically 
increasing  bounded  sequences  of  numbers,  the  existence  of  zeros  of  continuous  functions,  of  maxima 
and  minima  of  differentiable  functions,  of  integrals,  etc.  It  is  an  indispensable  building  block  of 
mathematical  analysis. 


5.1  The  Notion  of  an  Infinite  Sequence 

Definition 5.1 Let $X$ be a set. An (infinite) sequence with values in $X$ is a mapping
from $\mathbb{N}$ to $X$.

Fig. 5.1 Graph of a sequence

Thus each natural number $n$ (the index) is mapped to an element $a_n$ of $X$ (the $n$th
term of the sequence). We express this by using the notation

\[ (a_n)_{n \ge 1} = (a_1, a_2, a_3, \ldots). \]

In the case of $X = \mathbb{R}$ one speaks of real-valued sequences, if $X = \mathbb{C}$ of complex-valued
sequences, if $X = \mathbb{R}^n$ of vector-valued sequences. In this section we only
discuss real-valued sequences.

Sequences can be added

\[ (a_n)_{n \ge 1} + (b_n)_{n \ge 1} = (a_n + b_n)_{n \ge 1} \]

and multiplied by a scalar factor

\[ \lambda (a_n)_{n \ge 1} = (\lambda a_n)_{n \ge 1}. \]

These  operations  are  performed  componentwise  and  endow  the  set  of  all  real-valued 
sequences  with  the  structure  of  a  vector  space.  The  graph  of  a  sequence  is  visualised 
by  plotting  the  points  (n,an),n  =  1,  2,  3,  ...  in  a  coordinate  system,  see  Fig.  5.1. 

Experiment 5.2 The M-file mat05_1a.m offers the possibility to study various
examples of sequences which are increasing/decreasing, bounded/unbounded, oscillating,
convergent. For a better visualisation the discrete points of the graph of the
sequence are often connected by line segments (exclusively for graphical purposes);
this is implemented in the M-file mat05_1b.m. Open the applet Sequences and use
it to illustrate the sequences given in the M-file mat05_1a.m.

Sequences can either be defined explicitly by a formula, for instance

\[ a_n = 2^n, \]

or recursively by giving a starting value and a rule how to calculate a term from the
preceding one,

\[ a_1 = 1, \quad a_{n+1} = 2a_n. \]

The  recursion  can  also  involve  several  previous  terms  at  a  time. 
Example 5.3 A discrete population model which goes back to Verhulst (limited
growth) describes the population $x_n$ at the point in time $n$ (using time intervals of
length 1) by the recursive relation

\[ x_{n+1} = x_n + \beta x_n (L - x_n). \]

Here $\beta$ is a growth factor and $L$ the limiting population, i.e. the population which
is not exceeded in the long term (short-term overruns are possible; however, they lead
to immediate decay of the population). Additionally one has to prescribe the initial
population $x_1 = A$. According to the model the population increase $x_{n+1} - x_n$ during
one time interval is proportional to the existing population and to the difference to
the population limit. The M-file mat05_2.m contains a MATLAB function, called as

x = mat05_2(A,beta,N)

which computes and plots the first $N$ terms of the sequence $x = (x_1, \ldots, x_N)$. The
initial value is $A$, the growth rate $\beta$; $L$ was set to $L = 1$. Experiments with $A = 0.1$,
$N = 50$ and $\beta = 0.5$, $\beta = 1$, $\beta = 2$, $\beta = 2.5$, $\beta = 3$ show convergent, oscillating
and chaotic behaviour of the sequence, respectively.
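
The recursion of Example 5.3 is easy to reproduce. The following Python sketch is not the book's M-file; it is a minimal stand-in under the assumption that the model reads $x_{n+1} = x_n + \beta x_n (L - x_n)$ with $x_1 = A$. The function name `verhulst` is hypothetical, and the sketch only computes the terms (it does not plot them).

```python
def verhulst(A, beta, N, L=1.0):
    """Return the first N terms x_1, ..., x_N of the Verhulst recursion."""
    x = [A]
    for _ in range(N - 1):
        x_n = x[-1]
        # Increase is proportional to the population and to the gap L - x_n
        x.append(x_n + beta * x_n * (L - x_n))
    return x


slow = verhulst(0.1, 0.5, 50)     # beta = 0.5
fast = verhulst(0.1, 1.0, 50)     # beta = 1
wild = verhulst(0.1, 3.0, 50)     # beta = 3

print(slow[-1], fast[-1], wild[-5:])
```

Printing the last few terms for the different values of $\beta$ gives a first impression of the behaviour described in the example.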


Below  we  develop  some  concepts  which  help  to  describe  the  behaviour  of 
sequences. 

Definition 5.4 A sequence $(a_n)_{n \ge 1}$ is called monotonically increasing, if

\[ n \le m \ \Rightarrow\ a_n \le a_m; \]

$(a_n)_{n \ge 1}$ is called monotonically decreasing, if

\[ n \le m \ \Rightarrow\ a_n \ge a_m; \]

$(a_n)_{n \ge 1}$ is called bounded from above, if

\[ \exists T \in \mathbb{R} \ \forall n \in \mathbb{N} : a_n \le T. \]

We will show in Proposition 5.13 below that the set of upper bounds of a bounded
sequence has a smallest element. This least upper bound $T_0$ is called the supremum
of the sequence and denoted by

\[ T_0 = \sup_{n \in \mathbb{N}} a_n. \]


1  P.-F.  Verhulst,  1804-1849. 
The supremum is characterised by the following two conditions:

(a) $a_n \le T_0$ for all $n \in \mathbb{N}$;

(b) if $T$ is a real number and $a_n \le T$ for all $n \in \mathbb{N}$, then $T \ge T_0$.

Note that the supremum itself does not have to be a term of the sequence. However,
if this is the case, it is called the maximum of the sequence and denoted by

\[ T_0 = \max_{n \in \mathbb{N}} a_n. \]

A sequence has a maximum $T_0$ if the following two conditions are fulfilled:

(a) $a_n \le T_0$ for all $n \in \mathbb{N}$;

(b) there exists at least one $m \in \mathbb{N}$ such that $a_m = T_0$.

In the same way, a sequence $(a_n)_{n \ge 1}$ is called bounded from below, if

\[ \exists S \in \mathbb{R} \ \forall n \in \mathbb{N} : S \le a_n. \]

The greatest lower bound is called infimum (or minimum, if it is attained by a term
of the sequence).

Experiment 5.5 Investigate the sequences produced by the M-file mat05_1a.m
with regard to the concepts developed above.

As  mentioned  in  the  introduction  to  this  chapter,  the  concept  of  convergence  is 
a  central  concept  of  mathematical  analysis.  Intuitively  it  states  that  the  terms  of  the 
sequence $(a_n)_{n \ge 1}$ approach a limit $a$ with growing index $n$. For example, in Fig. 5.2
with $a = 0.8$ one has

\[ |a - a_n| < 0.2 \ \text{ from } n = 6, \quad |a - a_n| < 0.05 \ \text{ from } n = 21. \]

Fig. 5.2 Convergence of a sequence

For a precise definition of the concept of convergence we first introduce the notion
of an $\varepsilon$-neighbourhood of a point $a \in \mathbb{R}$ ($\varepsilon > 0$):

\[ U_\varepsilon(a) = \{x \in \mathbb{R} \,;\ |a - x| < \varepsilon\} = (a - \varepsilon, a + \varepsilon). \]

We say that a sequence $(a_n)_{n \ge 1}$ settles in a neighbourhood $U_\varepsilon(a)$, if from a certain
index on all subsequent terms $a_n$ of the sequence lie in $U_\varepsilon(a)$.

Definition 5.6 The sequence $(a_n)_{n \ge 1}$ converges to a limit $a$ if it settles in each
$\varepsilon$-neighbourhood of $a$.

These facts can be expressed in quantifier notation as follows:

\[ \forall \varepsilon > 0 \ \ \exists n(\varepsilon) \in \mathbb{N} \ \ \forall n \ge n(\varepsilon) : \ |a - a_n| < \varepsilon. \]

If a sequence $(a_n)_{n \ge 1}$ converges to a limit $a$, one writes

\[ a = \lim_{n \to \infty} a_n \quad \text{or} \quad a_n \to a \ \text{ as } n \to \infty. \]

In the example of Fig. 5.2 the limit $a$ is indicated as a dotted line, the neighbourhood
$U_{0.2}(a)$ as a strip with a dashed boundary line and the neighbourhood $U_{0.05}(a)$ as a
strip with a solid boundary line.
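
The index $n(\varepsilon)$ from which a sequence settles in an $\varepsilon$-neighbourhood can be explored numerically. The following Python sketch is an illustration only: the function name `settle_index`, the search horizon `n_max` and the sample sequence $a_n = 1/n$ with limit $0$ are assumptions made here, not taken from the book. A finite search can of course only suggest, never prove, that the sequence stays in the neighbourhood.

```python
def settle_index(a_n, a, eps, n_max=10_000):
    """Smallest n0 with |a - a_n(n)| < eps for all n0 <= n <= n_max.

    Returns n_max + 1 if even the term with index n_max lies outside
    the eps-neighbourhood of a.
    """
    n0 = n_max + 1
    # Walk backwards from the horizon as long as the terms stay inside
    for n in range(n_max, 0, -1):
        if abs(a - a_n(n)) < eps:
            n0 = n
        else:
            break
    return n0


def harmonic(n):
    """Sample sequence a_n = 1/n (hypothetical example)."""
    return 1 / n


n_coarse = settle_index(harmonic, 0, 0.2)
n_fine = settle_index(harmonic, 0, 0.05)

print(n_coarse, n_fine)
```

A smaller $\varepsilon$ forces a larger index, in line with the quantifier formulation above.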

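In the case of convergence the limit can be interchanged with addition, multiplication
and division (with the exception of zero), as expected.

Proposition 5.7 (Rules of calculation for limits) If the sequences $(a_n)_{n \ge 1}$ and
$(b_n)_{n \ge 1}$ are convergent then the following rules hold:

\[ \lim_{n \to \infty} (a_n + b_n) = \lim_{n \to \infty} a_n + \lim_{n \to \infty} b_n \]

\[ \lim_{n \to \infty} (\lambda a_n) = \lambda \lim_{n \to \infty} a_n \quad (\text{for } \lambda \in \mathbb{R}) \]

\[ \lim_{n \to \infty} (a_n b_n) = \big(\lim_{n \to \infty} a_n\big)\big(\lim_{n \to \infty} b_n\big) \]

\[ \lim_{n \to \infty} (a_n / b_n) = \big(\lim_{n \to \infty} a_n\big) / \big(\lim_{n \to \infty} b_n\big) \quad (\text{if } \lim_{n \to \infty} b_n \ne 0) \]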


Proof The verification of these trivialities is left to the reader as an exercise. The
proofs are not deep, but one has to carefully pick the right approach in order to verify
the conditions of Definition 5.6. In order to illustrate at least once how such proofs
are done, we will show the statement about multiplication. Assume that

\[ \lim_{n \to \infty} a_n = a \quad \text{and} \quad \lim_{n \to \infty} b_n = b. \]

Let $\varepsilon > 0$. According to Definition 5.6 we have to find an index $n(\varepsilon) \in \mathbb{N}$ satisfying

\[ |ab - a_n b_n| < \varepsilon \]

for all $n \ge n(\varepsilon)$. Due to the convergence of the sequence $(a_n)_{n \ge 1}$ we can first find
an $n_1(\varepsilon) \in \mathbb{N}$ so that $|a - a_n| \le 1$ for all $n \ge n_1(\varepsilon)$. For these $n$ it also applies that

\[ |a_n| = |a_n - a + a| \le 1 + |a|. \]

Furthermore, we can find $n_2(\varepsilon) \in \mathbb{N}$ and $n_3(\varepsilon) \in \mathbb{N}$ which guarantee that

\[ |a - a_n| < \frac{\varepsilon}{2\max(|b|, 1)} \quad \text{and} \quad |b - b_n| < \frac{\varepsilon}{2(1 + |a|)} \]

for all $n \ge n_2(\varepsilon)$ and $n \ge n_3(\varepsilon)$, respectively. It thus follows that

\[ |ab - a_n b_n| = |(a - a_n)b + a_n(b - b_n)| \le |a - a_n||b| + |a_n||b - b_n| \]
\[ \le |a - a_n||b| + (|a| + 1)|b - b_n| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]

for all $n \ge n(\varepsilon)$ with $n(\varepsilon) = \max(n_1(\varepsilon), n_2(\varepsilon), n_3(\varepsilon))$. This is the statement that was
to be proven. □


The important ideas of the proof were: Splitting in two summands with the help of
the triangle inequality (see Exercise 2 of Chap. 1); bounding $|a_n|$ by $1 + |a|$ using the
assumed convergence; upper bounds for the terms $|a - a_n|$ and $|b - b_n|$ by fractions
of $\varepsilon$ (again possible due to the convergence) so that the summands together stay less
than $\varepsilon$. All elementary proofs of convergence in mathematical analysis proceed in a
similar way.

Real-valued sequences with terms that increase to infinity with growing index $n$
have no limit in the sense of the definition given above. However, it is practical to
assign them the symbol $\infty$ as an improper limit.

Definition 5.8 A sequence $(a_n)_{n \ge 1}$ has the improper limit $\infty$ if it has the property
of unlimited increase

\[ \forall T \in \mathbb{R} \ \ \exists n(T) \in \mathbb{N} \ \ \forall n \ge n(T) : \ a_n \ge T. \]

In this case one writes

\[ \lim_{n \to \infty} a_n = \infty. \]

In the same way one defines

\[ \lim_{n \to \infty} b_n = -\infty, \quad \text{if} \quad \lim_{n \to \infty} (-b_n) = \infty. \]
Example 5.9 We consider the geometric sequence $(q^n)_{n \ge 1}$. It obviously holds that

\[ \lim_{n \to \infty} q^n = 0, \quad \text{if } |q| < 1, \]

\[ \lim_{n \to \infty} q^n = \infty, \quad \text{if } q > 1, \]

\[ \lim_{n \to \infty} q^n = 1, \quad \text{if } q = 1. \]

For $q \le -1$ the sequence has no limit (neither proper nor improper).


5.2  The  Completeness  of  the  Set  of  Real  Numbers 

As  remarked  in  the  introduction  to  this  chapter,  the  completeness  of  the  set  of  real 
numbers  is  one  of  the  pillars  of  real  analysis.  The  property  of  completeness  can  be 
expressed  in  different  ways.  We  will  use  a  simple  formulation  which  is  particularly 
helpful  in  many  applications. 

Proposition  5.10  (Completeness  of  the  set  of  real  numbers)  Each  monotonically 
increasing  sequence  of  real  numbers  that  is  bounded  from  above  has  a  limit  (in  R). 

Proof Let $(a_n)_{n \ge 1}$ be a monotonically increasing, bounded sequence. First we prove
the theorem in the case that all terms $a_n$ are non-negative. We write the terms as
decimal numbers

\[ a_n = A^{(n)}.\alpha_1^{(n)} \alpha_2^{(n)} \alpha_3^{(n)} \ldots \]

with $A^{(n)} \in \mathbb{N}_0$, $\alpha_j^{(n)} \in \{0, 1, \ldots, 9\}$. By assumption there is a bound $T \ge 0$ so that
$a_n \le T$ for all $n$. Therefore, also $A^{(n)} \le T$ for all $n$. But the sequence $(A^{(n)})_{n \ge 1}$
is a monotonically increasing, bounded sequence of integers and therefore must
eventually reach its least upper bound $A$ (and stay there). In other words, there exists
$n_0 \in \mathbb{N}$ such that

\[ A^{(n)} = A \quad \text{for all } n \ge n_0. \]

Thus we have found the integer part of the limit $a$ to be constructed:

\[ a = A.\ldots \]

Let now $\alpha_1 \in \{0, \ldots, 9\}$ be the least upper bound for $\alpha_1^{(n)}$. As the sequence is monotonically
increasing there is again an $n_1 \in \mathbb{N}$ with

\[ \alpha_1^{(n)} = \alpha_1 \quad \text{for all } n \ge n_1 \]

and consequently

\[ a = A.\alpha_1 \ldots \]

Let now $\alpha_2 \in \{0, \ldots, 9\}$ be the least upper bound for $\alpha_2^{(n)}$. There is an $n_2 \in \mathbb{N}$ with

\[ \alpha_2^{(n)} = \alpha_2 \quad \text{for all } n \ge n_2 \]

and consequently

\[ a = A.\alpha_1 \alpha_2 \ldots \]

Successively one defines a real number

\[ a = A.\alpha_1 \alpha_2 \alpha_3 \alpha_4 \ldots \]

in that way. It remains to show that $a = \lim_{n \to \infty} a_n$. Let $\varepsilon > 0$. We first choose
$j \in \mathbb{N}$ so that $10^{-j} < \varepsilon$. For $n \ge n_j$

\[ a - a_n = 0.\underbrace{000 \ldots 0}_{j} \ldots \]

since the first $j$ digits after the decimal point in $a$ coincide with those of $a_n$ provided
$n \ge n_j$. Therefore,

\[ |a - a_n| \le 10^{-j} < \varepsilon \quad \text{for } n \ge n_j. \]

With $n(\varepsilon) = n_j$ the condition required in Definition 5.6 is fulfilled.

If the sequence $(a_n)_{n \ge 1}$ also has negative terms, it can be transformed to a sequence
with non-negative terms by adding the absolute value of the first term which results
in the sequence $(|a_1| + a_n)_{n \ge 1}$. Using the obvious rule $\lim(c + a_n) = c + \lim a_n$
allows one to apply the first part of the proof. □

Remark 5.11 The set of rational numbers is not complete. For example, the decimal
expansion of $\sqrt{2}$,

\[ (1,\ 1.4,\ 1.41,\ 1.414,\ 1.4142,\ \ldots) \]

is a monotonically increasing, bounded sequence of rational numbers (an upper
bound is, e.g. $T = 1.5$, since $1.5^2 > 2$), but the limit $\sqrt{2}$ does not belong to $\mathbb{Q}$ (as it
is an irrational number).
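
The sequence of Remark 5.11 can be generated exactly with rational arithmetic. The following Python sketch is an illustration using the standard modules `fractions` and `math`; the function name `sqrt2_truncated` is hypothetical. It uses the integer square root to truncate $\sqrt{2}$ after $k$ decimal places without any floating-point rounding.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncated(k):
    """Decimal expansion of sqrt(2), truncated after k decimal places."""
    # isqrt(2 * 10**(2k)) is the integer part of sqrt(2) * 10**k
    return Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)


approximations = [sqrt2_truncated(k) for k in range(5)]

for a in approximations:
    # Every term is rational, its square stays below 2, and 1.5 is an upper bound
    print(a, float(a), a * a < 2, a < Fraction(3, 2))
```

Every term is a rational number whose square is smaller than 2, so none of them equals the limit.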

Example 5.12 (Arithmetic of real numbers) Due to Proposition 5.10 the arithmetical
operations on the real numbers introduced in Sect. 1.2 can be legitimised a posteriori.
Let us look, for instance, at the addition of two non-negative real numbers
$a = A.\alpha_1 \alpha_2 \ldots$ and $b = B.\beta_1 \beta_2 \ldots$ with $A, B \in \mathbb{N}_0$, $\alpha_j, \beta_j \in \{0, 1, \ldots, 9\}$. By
truncating them after the $n$th decimal place we obtain two approximating sequences
of rational numbers $a_n = A.\alpha_1 \alpha_2 \ldots \alpha_n$ and $b_n = B.\beta_1 \beta_2 \ldots \beta_n$ with

\[ a = \lim_{n \to \infty} a_n, \quad b = \lim_{n \to \infty} b_n. \]

The sum of two approximations $a_n + b_n$ is defined by the addition of rational numbers
in an elementary way. The sequence $(a_n + b_n)_{n \ge 1}$ is evidently monotonically
increasing and bounded from above, for instance, by $A + B + 2$. According to Proposition 5.10
this sequence has a limit and this limit defines the sum of the real numbers

\[ a + b = \lim_{n \to \infty} (a_n + b_n). \]

In this way the addition of real numbers is rigorously justified. In a similar way
one can proceed with multiplication. Finally, Proposition 5.7 allows one to prove the
usual rules for addition and multiplication.

Consider a sequence with upper bound $T$. Each real number $T_1 > T$ is also an upper bound. We can now show that there always exists a smallest upper bound. A bounded sequence thus actually has a supremum as claimed earlier.

Proposition 5.13 Each sequence $(a_n)_{n\ge1}$ of real numbers which is bounded from above has a supremum.

Proof Let $T_n = \max\{a_1, \ldots, a_n\}$ be the maximum of the first $n$ terms of the sequence. These maxima on their part define a sequence $(T_n)_{n\ge1}$ which is bounded from above by the same bounds as $(a_n)_{n\ge1}$ but is additionally monotonically increasing. According to the previous proposition it has a limit $T_0$. We are going to show that this limit is the supremum of the original sequence. Indeed, as $T_n \le T_0$ for all $n$, we have $a_n \le T_0$ for all $n$ as well. Assume that the sequence $(a_n)_{n\ge1}$ had a smaller upper bound $T < T_0$, i.e. $a_n \le T$ for all $n$. This in turn implies $T_n \le T$ for all $n$ and contradicts the fact that $T_0 = \lim T_n$. Therefore, $T_0$ is the least upper bound. □

Application 5.14 We are now in a position to show that the construction of the exponential function for real exponents given informally in Sect. 2.2 is justified. Let $a > 0$ be a basis for the power $a^r$ to be defined with real exponent $r \in \mathbb{R}$. It is sufficient to treat the case $r > 0$ (for negative $r$, the expression $a^r$ is defined by the reciprocal of $a^{|r|}$). We write $r$ as the limit of a monotonically increasing sequence $(r_n)_{n\ge1}$ of rational numbers by choosing for $r_n$ the decimal representation of $r$, truncated at the $n$th digit. For $a \ge 1$ the rules of calculation for rational exponents imply the inequality $a^{r_{n+1}} - a^{r_n} = a^{r_n}\left(a^{r_{n+1}-r_n} - 1\right) \ge 0$. This shows that the sequence $(a^{r_n})_{n\ge1}$ is monotonically increasing. It is also bounded from above, for instance, by $a^q$, if $q$ is a rational number bigger than $r$. According to Proposition 5.10 this sequence has a limit. It defines $a^r$. (For $0 < a < 1$ one applies the same argument to $1/a$ and takes the reciprocal.)
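The construction can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: the helper names `truncate` and `power_sequence` are hypothetical, floating-point powers stand in for exact rational powers, and the basis is assumed to satisfy $a \ge 1$ so that the sequence is increasing.

```python
from fractions import Fraction
import math

def truncate(r, n):
    """Decimal representation of r >= 0 truncated after the nth digit, as a rational."""
    return Fraction(math.floor(r * 10**n), 10**n)

def power_sequence(a, r, count):
    """Return a**r_n for n = 1..count, where r_n is r truncated after n digits."""
    return [a ** float(truncate(r, n)) for n in range(1, count + 1)]

# Assumed sample data: basis a = 2 and exponent r = sqrt(2).
values = power_sequence(2.0, math.sqrt(2), 8)
for n, v in enumerate(values, start=1):
    print(n, v)
```

The printed values increase and stay below `2.0 ** math.sqrt(2)`, which is what monotonicity and boundedness of $(a^{r_n})_{n\ge1}$ predict.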
Application 5.15 Let $a > 0$. Then $\lim_{n\to\infty} \sqrt[n]{a} = 1$.

In the proof we can restrict ourselves to the case $0 < a < 1$ since otherwise the argument can be used for $1/a$. One can easily see that the sequence $(\sqrt[n]{a})_{n\ge1}$ is monotonically increasing; it is also bounded from above by 1. Therefore, it has a limit $b$. Suppose that $b < 1$. From $\sqrt[n]{a} \le b$ we infer that $a \le b^n \to 0$ for $n \to \infty$, which contradicts the assumption $a > 0$. Consequently $b = 1$.


5.3  Infinite  Series 

Sums of the form

$$\sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + \cdots$$

with infinitely many summands can be given a meaning under certain conditions. The starting point of our considerations is a sequence of coefficients $(a_k)_{k\ge1}$ of real numbers. The $n$th partial sum is defined as

$$S_n = \sum_{k=1}^{n} a_k = a_1 + a_2 + \cdots + a_n,$$

thus

$$S_1 = a_1, \qquad S_2 = a_1 + a_2, \qquad S_3 = a_1 + a_2 + a_3, \quad\text{etc.}$$

As needed we also use the notation $S_n = \sum_{k=0}^{n} a_k$ without further comment if the sequence $a_0, a_1, a_2, a_3, \ldots$ starts with the index $k = 0$.

Definition 5.16 The sequence of the partial sums $(S_n)_{n\ge1}$ is called a series. If the limit $S = \lim_{n\to\infty} S_n$ exists, then the series is called convergent, otherwise divergent.

In the case of convergence one writes

$$S = \sum_{k=1}^{\infty} a_k.$$

In this way the summation problem is reduced to the question of convergence of the sequence of the partial sums.
Experiment 5.17 The M-file mat05_3.m, when called as mat05_3(N,Z), generates the first N partial sums with time delay Z [seconds] of five series, i.e. it computes $S_n$ for $1 \le n \le N$ in each case:

Series 1:
Series 2: $S_n = \sum_{k=1}^{n} k^{-1}$
Series 3:
Series 4: $S_n = \sum_{k=1}^{n} k^{-2}$
Series 5:

Experiment with increasing values of N and try to see which series shows convergence or divergence.

In the experiment the convergence of Series 5 seems obvious, while the observations for the other series are less conclusive. Actually, Series 1 and 2 are divergent while the others are convergent. This shows the need for analytical tools in order to be able to decide the question of convergence. However, we first look at a few examples.
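A rough Python counterpart of the experiment can be sketched as follows. It is not the book's M-file: the function name `partial_sums` is an assumption, the time delay is omitted, and only Series 2 and Series 4 are computed.

```python
from itertools import accumulate

def partial_sums(term, N):
    """Return [S_1, ..., S_N] with S_n = term(1) + ... + term(n)."""
    return list(accumulate(term(k) for k in range(1, N + 1)))

series2 = partial_sums(lambda k: 1.0 / k, 10**5)      # Series 2: terms k**(-1)
series4 = partial_sums(lambda k: 1.0 / k**2, 10**5)   # Series 4: terms k**(-2)

# Compare the partial sums at a few widely spaced indices.
for n in (10, 1000, 100000):
    print(n, series2[n - 1], series4[n - 1])
```

The partial sums of Series 2 keep growing by a similar amount for each further factor of ten in n, whereas those of Series 4 settle down quickly. This is the kind of observation that the following analytical tools make precise.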

Example 5.18 (Geometric series) In this example we are concerned with the series $\sum_{k=0}^{\infty} q^k$ with real factor $q \in \mathbb{R}$. For the partial sums we deduce that, for $q \ne 1$,

$$S_n = \sum_{k=0}^{n} q^k = \frac{1 - q^{n+1}}{1 - q}.$$

Indeed, by subtraction of the two lines

$$S_n = 1 + q + q^2 + \cdots + q^n,$$
$$qS_n = q + q^2 + q^3 + \cdots + q^{n+1}$$

one obtains the formula $(1 - q)S_n = 1 - q^{n+1}$ from which the result follows.

The case $|q| < 1$: As $q^{n+1} \to 0$ the series converges with value

$$S = \lim_{n\to\infty} \frac{1 - q^{n+1}}{1 - q} = \frac{1}{1 - q}.$$

The case $|q| > 1$: For $q > 1$ the partial sum $S_n = (q^{n+1} - 1)/(q - 1) \to \infty$ and the series diverges. In the case of $q < -1$ the partial sums $S_n = (1 - (-1)^{n+1}|q|^{n+1})/(1 - q)$ are unbounded and oscillate. They thus diverge as well.

The case $|q| = 1$: For $q = 1$ we have $S_n = 1 + 1 + \cdots + 1 = n + 1$ which tends to infinity; for $q = -1$, the partial sums $S_n$ oscillate between 1 and 0. In both cases the series diverges.
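The closed formula and the behaviour in the three cases can be checked with exact rational arithmetic. The following Python sketch uses hypothetical helper names (`geometric_partial_sum`, `closed_form`) and sample values of $q$ chosen for illustration.

```python
from fractions import Fraction

def geometric_partial_sum(q, n):
    """S_n = 1 + q + q**2 + ... + q**n, summed term by term."""
    return sum(q**k for k in range(n + 1))

def closed_form(q, n):
    """(1 - q**(n+1)) / (1 - q), valid for q != 1."""
    return (1 - q**(n + 1)) / (1 - q)

q = Fraction(1, 2)
for n in (0, 1, 5, 20):
    # Term-by-term sum and closed formula agree exactly.
    assert geometric_partial_sum(q, n) == closed_form(q, n)

# |q| < 1: the partial sums approach 1 / (1 - q).
print(float(geometric_partial_sum(q, 20)), float(1 / (1 - q)))
# q = -1: the partial sums oscillate between 1 and 0.
print([geometric_partial_sum(-1, n) for n in range(6)])
```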


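Example 5.19 The $n$th partial sum of the series $\sum_{k=1}^{\infty} \frac{1}{k(k+1)}$ is

$$S_n = \sum_{k=1}^{n} \frac{1}{k(k+1)} = \sum_{k=1}^{n} \left(\frac{1}{k} - \frac{1}{k+1}\right) = 1 - \frac{1}{n+1}.$$

It is called a telescopic sum. The series converges to

$$S = \sum_{k=1}^{\infty} \frac{1}{k(k+1)} = \lim_{n\to\infty} \left(1 - \frac{1}{n+1}\right) = 1.$$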

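Example 5.20 (Harmonic series) We consider the series $\sum_{k=1}^{\infty} \frac{1}{k}$. By combining blocks of two, four, eight, sixteen, etc., elements, one obtains the grouping

$$1 + \tfrac12 + \left(\tfrac13 + \tfrac14\right) + \left(\tfrac15 + \tfrac16 + \tfrac17 + \tfrac18\right) + \left(\tfrac19 + \cdots + \tfrac1{16}\right) + \left(\tfrac1{17} + \cdots\right) + \cdots$$
$$> 1 + \tfrac12 + \left(\tfrac14 + \tfrac14\right) + \left(\tfrac18 + \tfrac18 + \tfrac18 + \tfrac18\right) + \left(\tfrac1{16} + \cdots + \tfrac1{16}\right) + \left(\tfrac1{32} + \cdots\right) + \cdots$$
$$= 1 + \tfrac12 + \tfrac12 + \tfrac12 + \tfrac12 + \tfrac12 + \cdots \to \infty.$$

The partial sums tend to infinity, therefore, the series diverges.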
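The grouping argument says that the partial sum up to index $2^m$ is at least $1 + m/2$. A small Python sketch (the helper name `harmonic` is an assumption) confirms this bound with exact fractions for the first few $m$:

```python
from fractions import Fraction

def harmonic(n):
    """Exact nth partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for m in range(0, 11):
    s = harmonic(2**m)
    assert s >= 1 + Fraction(m, 2)   # lower bound obtained from the grouping
    print(m, 2**m, float(s))
```

The growth is very slow, which explains why numerical experiments alone do not settle the question of divergence.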


There  are  a  number  of  criteria  which  allow  one  to  decide  whether  a  series  converges 
or  diverges.  Here  we  only  discuss  two  simple  comparison  criteria,  which  suffice  for 
our  purpose.  For  further  considerations  we  refer  to  the  literature,  for  instance  [3, 
Chap.  9.2]. 


Proposition 5.21 (Comparison criteria) Let $0 \le a_k \le b_k$ for all $k \in \mathbb{N}$ or at least for all $k$ greater than or equal to a certain $k_0$. Then we have:

(a) If the series $\sum_{k=1}^{\infty} b_k$ is convergent then the series $\sum_{k=1}^{\infty} a_k$ converges, too.

(b) If the series $\sum_{k=1}^{\infty} a_k$ is divergent then the series $\sum_{k=1}^{\infty} b_k$ diverges, too.

Proof (a) The partial sums fulfill $S_n = \sum_{k=1}^{n} a_k \le \sum_{k=1}^{\infty} b_k = T$ and $S_n \le S_{n+1}$, hence are bounded and monotonically increasing. According to Proposition 5.10 the limit of the partial sums exists.

(b) This time, we have for the partial sums

$$T_n = \sum_{k=1}^{n} b_k \ge \sum_{k=1}^{n} a_k \to \infty,$$

since the latter are positive and divergent. □

Under the condition $0 \le a_k \le b_k$ of the proposition one says that $\sum_{k=1}^{\infty} b_k$ dominates $\sum_{k=1}^{\infty} a_k$. A series thus converges if it is dominated by a convergent series; it diverges if it dominates a divergent series.


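Example 5.22 The series

$$\sum_{k=1}^{\infty} \frac{1}{k^2}$$

is convergent. For the proof we use that

$$\sum_{k=1}^{n} \frac{1}{k^2} = 1 + \sum_{j=1}^{n-1} \frac{1}{(j+1)^2} \qquad\text{and}\qquad a_j = \frac{1}{(j+1)^2} \le \frac{1}{j(j+1)} = b_j.$$

Example 5.19 shows that $\sum_{j=1}^{\infty} b_j$ converges. Proposition 5.21 then implies convergence of the original series.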
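Combining the splitting above with the telescopic sum from Example 5.19 gives the explicit bound $\sum_{k=1}^{n} 1/k^2 \le 1 + (1 - 1/n)$. The following Python sketch (function names are assumptions) verifies this chain for some $n$ with exact fractions:

```python
from fractions import Fraction

def sum_inverse_squares(n):
    """Exact partial sum 1/1**2 + 1/2**2 + ... + 1/n**2."""
    return sum(Fraction(1, k * k) for k in range(1, n + 1))

def telescoping_bound(n):
    """1 + sum_{j=1}^{n-1} 1/(j(j+1)), the dominating partial sum."""
    return 1 + sum(Fraction(1, j * (j + 1)) for j in range(1, n))

for n in (1, 2, 10, 100):
    assert telescoping_bound(n) == 2 - Fraction(1, n)        # telescopic sum
    assert sum_inverse_squares(n) <= telescoping_bound(n)    # domination
    print(n, float(sum_inverse_squares(n)), float(telescoping_bound(n)))
```

In particular all partial sums stay below 2, so the increasing sequence of partial sums is bounded.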


Example 5.23 The series $\sum_{k=1}^{\infty} k^{-0.99}$ diverges. This follows from the fact that $k^{-1} \le k^{-0.99}$. Therefore, the series $\sum_{k=1}^{\infty} k^{-0.99}$ dominates the harmonic series which itself is divergent, see Example 5.20.


Example 5.24 In Chap. 2 Euler's number

$$e = \sum_{j=0}^{\infty} \frac{1}{j!} = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \frac{1}{120} + \cdots$$

was introduced. We can now show that this definition makes sense, i.e. the series converges. For $j \ge 4$ it is obvious that

$$j! = 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdots j \ge 2 \cdot 2 \cdot 2 \cdot 2 \cdot 2 \cdots 2 = 2^j.$$

Thus the geometric series $\sum_{j=0}^{\infty} \left(\tfrac12\right)^j$ is a dominating convergent series.
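Both the termwise estimate and the fast convergence can be observed numerically. The Python sketch below is an illustration only; the function name `euler_partial_sum` is an assumption, and `math.e` serves as the reference value.

```python
from fractions import Fraction
from math import factorial, e

def euler_partial_sum(n):
    """Exact partial sum 1/0! + 1/1! + ... + 1/n!."""
    return sum(Fraction(1, factorial(j)) for j in range(n + 1))

# Termwise domination used in the text: j! >= 2**j for j >= 4.
assert all(factorial(j) >= 2**j for j in range(4, 60))

for n in (1, 2, 5, 10, 15):
    print(n, float(euler_partial_sum(n)))
print("math.e =", e)
```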

Example 5.25 The decimal notation of a positive real number

$$a = A.\alpha_1\alpha_2\alpha_3\ldots$$

with $A \in \mathbb{N}_0$, $\alpha_k \in \{0, \ldots, 9\}$ can be understood as a representation by the series

$$a = A + \sum_{k=1}^{\infty} \alpha_k 10^{-k}.$$

The series converges since $A + 9\sum_{k=1}^{\infty} 10^{-k}$ is a dominating convergent series.
5.4 Supplement: Accumulation Points of Sequences

Occasionally we need sequences which themselves do not converge but have convergent subsequences. The notions of accumulation points, limit superior and limit inferior are connected with this concept.

Definition 5.26 A number $b$ is called accumulation point of a sequence $(a_n)_{n\ge1}$ if each neighbourhood $U_\varepsilon(b)$ of $b$ contains infinitely many terms of the sequence:

$$\forall \varepsilon > 0 \ \ \forall n \in \mathbb{N} \ \ \exists m = m(n, \varepsilon) \ge n : \ |b - a_m| < \varepsilon.$$

Figure 5.3 displays the sequence

$$a_n = \arctan n + \cos(n\pi/2) + \frac{1}{n}\sin(n\pi/2).$$

It has three accumulation points, namely $b_1 = \pi/2 + 1 \approx 2.57$, $b_2 = \pi/2 \approx 1.57$ and $b_3 = \pi/2 - 1 \approx 0.57$.
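The three accumulation points become visible when the indices are split according to their remainder modulo 4. The following Python sketch evaluates the sequence at one large index in each class; the index $4 \cdot 10^6$ is an arbitrary choice made for illustration.

```python
import math

def a(n):
    """Term a_n of the sequence displayed in Fig. 5.3."""
    return math.atan(n) + math.cos(n * math.pi / 2) + math.sin(n * math.pi / 2) / n

# cos(n*pi/2) takes the values 1, 0, -1, 0 for n = 0, 1, 2, 3 modulo 4.
for residue in range(4):
    n = 4 * 10**6 + residue
    print(residue, a(n))

# Reference values: the three accumulation points.
print(math.pi / 2 + 1, math.pi / 2, math.pi / 2 - 1)
```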

If  a  sequence  is  convergent  with  limit  a  then  a  is  the  unique  accumulation  point. 
Accumulation  points  of  a  sequence  can  also  be  characterised  with  the  help  of  the 
concept  of  subsequences. 

Definition 5.27 If $1 \le n_1 < n_2 < n_3 < \cdots$ is a strictly monotonically increasing sequence of integers (indices) then

$$(a_{n_j})_{j\ge1}$$

is called a subsequence of the sequence $(a_n)_{n\ge1}$.

Fig. 5.3 Accumulation points of a sequence

Example 5.28 We start with the sequence $a_n = \frac{1}{n}$. If we take, for instance, $n_j = j^2$ then we obtain the sequence $a_{n_j} = \frac{1}{j^2}$ as subsequence:

$$(a_n)_{n\ge1} = \left(1, \tfrac12, \tfrac13, \tfrac14, \tfrac15, \tfrac16, \tfrac17, \tfrac18, \tfrac19, \tfrac1{10}, \ldots\right),$$
$$(a_{n_j})_{j\ge1} = \left(1, \tfrac14, \tfrac19, \ldots\right).$$

Proposition 5.29 A number $b$ is an accumulation point of the sequence $(a_n)_{n\ge1}$ if and only if $b$ is the limit of a convergent subsequence $(a_{n_j})_{j\ge1}$.

Proof Let $b$ be an accumulation point of the sequence $(a_n)_{n\ge1}$. Step by step we will construct a strictly monotonically increasing sequence of indices $(n_j)_{j\ge1}$ so that

$$|b - a_{n_j}| < \frac{1}{j}$$

is fulfilled for all $j \in \mathbb{N}$. According to Definition 5.26 for $\varepsilon_1 = 1$ we have

$$\forall n \in \mathbb{N} \ \ \exists m \ge n : \ |b - a_m| < \varepsilon_1.$$

We choose $n = 1$ and denote the smallest $m \ge n$ which fulfills this condition by $n_1$. Thus

$$|b - a_{n_1}| < \varepsilon_1 = 1.$$

For $\varepsilon_2 = \frac12$ one again obtains according to Definition 5.26:

$$\forall n \in \mathbb{N} \ \ \exists m \ge n : \ |b - a_m| < \varepsilon_2.$$

This time we choose $n = n_1 + 1$ and denote the smallest $m \ge n_1 + 1$ which fulfills this condition by $n_2$. Thus

$$|b - a_{n_2}| < \varepsilon_2 = \frac12.$$

It is clear how one has to proceed. Once $n_j$ is constructed one sets $\varepsilon_{j+1} = 1/(j+1)$ and uses Definition 5.26 according to which

$$\forall n \in \mathbb{N} \ \ \exists m \ge n : \ |b - a_m| < \varepsilon_{j+1}.$$

We choose $n = n_j + 1$ and denote the smallest $m \ge n_j + 1$ which fulfills this condition by $n_{j+1}$. Thus

$$|b - a_{n_{j+1}}| < \varepsilon_{j+1} = \frac{1}{j+1}.$$

This procedure guarantees on the one hand that the sequence of indices $(n_j)_{j\ge1}$ is strictly monotonically increasing and on the other hand that the desired inequality is fulfilled for all $j \in \mathbb{N}$. In particular, $(a_{n_j})_{j\ge1}$ is a subsequence that converges to $b$.

Conversely, it is obvious that the limit of a convergent subsequence is an accumulation point of the original sequence. □

In the proof of the proposition we have used the method of recursive definition of a sequence, namely the subsequence $(a_{n_j})_{j\ge1}$.
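The recursive construction from the proof translates almost literally into a program. The following Python sketch applies it to the sequence of Fig. 5.3 and the accumulation point $\pi/2 + 1$. The function name `subsequence_indices` and the safeguard `search_limit` are assumptions added for illustration; a program can only search finitely far, whereas the proof relies on the existence of a suitable index.

```python
import math

def a(n):
    """Term a_n of the sequence displayed in Fig. 5.3."""
    return math.atan(n) + math.cos(n * math.pi / 2) + math.sin(n * math.pi / 2) / n

def subsequence_indices(a, b, count, search_limit=10**6):
    """Indices n_1 < n_2 < ... with |b - a(n_j)| < 1/j, found as in the proof."""
    indices = []
    n = 1                                   # the search for n_1 starts at n = 1
    for j in range(1, count + 1):
        m = n
        while abs(b - a(m)) >= 1.0 / j:     # smallest m >= n with |b - a_m| < 1/j
            m += 1
            if m > search_limit:
                raise ValueError("no suitable index found within search_limit")
        indices.append(m)
        n = m + 1                           # the next search starts at n_j + 1
    return indices

b = math.pi / 2 + 1
idx = subsequence_indices(a, b, 10)
print(idx)
print([abs(b - a(n)) for n in idx])
```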

We next want to show that each bounded sequence has at least one accumulation point or, equivalently, a convergent subsequence. This result bears the names of Bolzano and Weierstrass and is an important technical tool for proofs in many areas of analysis.

Proposition 5.30 (Theorem of Bolzano-Weierstrass) Every bounded sequence $(a_n)_{n\ge1}$ has (at least) one accumulation point.

Proof Due to the boundedness of the sequence there are bounds $b < c$ so that all terms of the sequence $a_n$ lie between $b$ and $c$. We bisect the interval $[b, c]$. Then in at least one of the two half-intervals $[b, (b+c)/2]$ or $[(b+c)/2, c]$ there have to be infinitely many terms of the sequence. We choose such a half-interval and call it $[b_1, c_1]$. This interval is also bisected; in one of the two halves again there have to be infinitely many terms of the sequence. We call this quarter-interval $[b_2, c_2]$. Continuing this way we obtain a sequence of intervals $[b_n, c_n]$ of length $2^{-n}(c - b)$ each of which contains infinitely many terms of the sequence. Obviously the $b_n$ are monotonically increasing and bounded, therefore converge to a limit $\bar b$. Since $b_n \le \bar b \le c_n$, each interval $[\bar b - 2^{-n}(c - b), \bar b + 2^{-n}(c - b)]$ contains $[b_n, c_n]$ and hence, by construction, infinitely many terms of the sequence. Thus $\bar b$ is an accumulation point of the sequence. □

If the sequence $(a_n)_{n\ge1}$ is bounded then the set of its accumulation points is also bounded and hence has a supremum. This supremum is itself an accumulation point of the sequence (which can be shown by constructing a suitable convergent subsequence) and thus forms the largest accumulation point.

Definition 5.31 The largest accumulation point of a bounded sequence is called limit superior and is denoted by $\overline{\lim}_{n\to\infty} a_n$ or $\limsup_{n\to\infty} a_n$. The smallest accumulation point is called limit inferior with the corresponding notation $\underline{\lim}_{n\to\infty} a_n$ or $\liminf_{n\to\infty} a_n$.


2B.  Bolzano,  1781-1848. 

3K.  Weierstrass,  1815-1897. 
The relationships

$$\limsup_{n\to\infty} a_n = \lim_{n\to\infty} \Big(\sup_{m\ge n} a_m\Big), \qquad \liminf_{n\to\infty} a_n = \lim_{n\to\infty} \Big(\inf_{m\ge n} a_m\Big)$$

follow easily from the definition and justify the notation.

For example, the sequence $(a_n)_{n\ge1}$ from Fig. 5.3 has $\limsup_{n\to\infty} a_n = \pi/2 + 1$ and $\liminf_{n\to\infty} a_n = \pi/2 - 1$.


5.5  Exercises 


1.  Find  a  law  of  formation  for  the  sequences  below  and  check  for  monotonicity, 
boundedness  and  convergence: 


1  3 
4’  9’ 


_5_  1_ 

16’  25’ 


_9_ 

36’  " 
J_ 

16’  " 


5 


2. 


Verify  that  the  sequence  an  =  converges  to  1 . 
Hint.  Given  e  >  0,  find  n(s)  such  that 


1  +  n2 


<  £ 


for  all  n  >n(e). 

3. Determine a recursion formula that provides the terms of the geometric sequence $a_n = q^n$, $n \ge 0$ successively. Write a MATLAB program that calculates the first $N$ terms of the geometric sequence for an arbitrary $q \in \mathbb{R}$.

Check the convergence behaviour for different values of $q$ and plot the results. Do the same with the help of the applet Sequences.

4.  Investigate  whether  the  following  sequences  converge  and,  in  case  of  conver¬ 
gence,  compute  the  limit: 


n 


n  +  1 


CLn  — 


dn  —  n  — 


n  +  1  n 
n^  T  3 n  T  1 


1 

bn  —  H — 5 

n 


cn=  I  -- 


n 


n 


1 


n 


n 


=  2  («■+<=-)• 


fn  =  COS(f27T), 


5.  Investigate  whether  the  following  sequences  have  a  limit  or  an  accumulation 
point.  Compute,  if  existent,  lim,  lim  inf,  lim  sup,  inf,  sup: 


CLn 


n  +  7 

ft3  +  n  +  1  ’ 


bn 


1  -3  n2 

Tn  +  5  ’ 

1  +  (-!)" 


Cn 


Qn  —  e 
Qn  +  e 


— n 


— n 


9 


fn  =  (l  +  (—!)")(— 1)"/2. 


dn  =  !  +  (-!)", 


n 
6. Open the applet Sequences, visualise the sequences from Exercises 4 and 5 and discuss their behaviour by means of their graphs.

7. The population model of Verhulst from Example 5.3 can be described in appropriate units in simplified form by the recursive relationship

$$x_{n+1} = r x_n (1 - x_n), \qquad n = 0, 1, 2, 3, \ldots$$

with an initial value $x_0$ and a parameter $r$. We presume that $0 \le x_0 \le 1$ and $0 \le r \le 4$ (since all $x_n$ then stay in the interval $[0, 1]$). Write a MATLAB program which calculates for given $r$, $x_0$, $N$ the first $N$ terms of the sequence $(x_n)_{n\ge1}$. With the help of your program (and some numerical values for $r$, $x_0$, $N$) check the following statements:

(a) For $0 \le r \le 1$ the sequence $x_n$ converges to 0.

(b) For $1 < r < 2\sqrt{2}$ the sequence $x_n$ tends to a positive limit.

(c) For $3 < r < 1 + \sqrt{6}$ the sequence $x_n$ eventually oscillates between two different positive values.

(d) For $3.75 < r \le 4$ the sequence $x_n$ behaves chaotically.

Illustrate these assertions also with the applet Sequences.

8. The sequence $(a_n)_{n\ge1}$ is given recursively by


_  1  2 
an+ 1  —  ^  an 


1 

2  ' 


Which starting values $A \in \mathbb{R}$ are fixed points of the recursion, i.e. it holds $A = a_1 = a_2 = \ldots$? Investigate for which starting values $A \in \mathbb{R}$ the sequence converges or diverges, respectively. You can use the applet Sequences for that. Try to locate the regions of convergence and divergence as precisely as possible.

9. Write a MATLAB program which, for given $a \in [0, 1]$ and $N \in \mathbb{N}$, calculates the first $N$ terms of the sequence

$$x_n = na - \lfloor na \rfloor, \qquad n = 1, 2, 3, \ldots, N$$

($\lfloor na \rfloor$ denotes the largest integer smaller than or equal to $na$). With the help of your program, investigate the behaviour of the sequence for a rational $a = \frac{p}{q}$ and for an irrational $a$ (or at least a very precise rational approximation to an irrational $a$) by plotting the terms of the sequence and by visualising their distribution in a histogram. Use the MATLAB commands floor and hist.

10.  Give  formal  proofs  for  the  remaining  rules  of  calculation  of  Proposition  5.7,  i.e. 
for  addition  and  division  by  modifying  the  proof  for  the  multiplication  rule. 

11.  Check  the  following  series  for  convergence  with  the  help  of  the  comparison 
criteria: 


oo 


E 


1 

k(k  +  2)  ’ 


oo 


E 


1 

C' 
12. Check the following series for convergence:


oo 


E 


2  +  k1 


13.  Try  to  find  out  how  the  partial  sums  Sn  of  the  series  in  Exercises  11  and  12  can 
be  calculated  with  the  help  of  a  recursion  and  then  study  their  behaviour  with 
the  applet  Sequences. 

14.  Prove  the  convergence  of  the  series 


Hint. Use the fact that $j! > 4^j$ is fulfilled for $j \ge 9$ (why?). From this it follows that $\sqrt{1/j!} \le 1/2^j$. Now apply the appropriate comparison criterion.

15. Prove the ratio test for series with positive terms $a_k > 0$: If there exists a number $q$, $0 < q < 1$, such that the quotients satisfy

$$\frac{a_{k+1}}{a_k} \le q$$

for all $k \in \mathbb{N}_0$, then the series $\sum_{k=0}^{\infty} a_k$ converges.

Hint. From the assumption it follows that $a_1 \le a_0 q$, $a_2 \le a_1 q \le a_0 q^2$ and thus successively $a_k \le a_0 q^k$ for all $k$. Now use the comparison criteria and the convergence of the geometric series with $q < 1$.


6 Limits and Continuity of Functions


In  this  section  we  extend  the  notion  of  the  limit  of  a  sequence  to  the  concept  of  the  limit 
of  a  function.  Hereby  we  obtain  a  tool  which  enables  us  to  investigate  the  behaviour 
of  graphs  of  functions  in  the  neighbourhood  of  chosen  points.  Moreover,  limits  of 
functions  form  the  basis  of  one  of  the  central  themes  in  mathematical  analysis,  namely 
differentiation  (Chap.  7).  In  order  to  derive  certain  differentiation  formulas  some 
elementary  limits  are  needed,  for  instance,  limits  of  trigonometric  functions.  The 
property  of  continuity  of  a  function  has  far-reaching  consequences  like,  for  instance, 
the  intermediate  value  theorem ,  according  to  which  a  continuous  function  which 
changes  its  sign  in  an  interval  has  a  zero.  Not  only  does  this  theorem  allow  one  to  show 
the  solvability  of  equations,  it  also  provides  numerical  procedures  to  approximate 
the  solutions.  Further  material  on  continuity  can  be  found  in  Appendix  C. 


6.1  The  Notion  of  Continuity 

We start with the investigation of the behaviour of graphs of real functions

$$f : (a, b) \to \mathbb{R}$$

while approaching a point $x$ in the open interval $(a, b)$ or a boundary point of the closed interval $[a, b]$. For that we need the notion of a zero sequence, i.e. a sequence of real numbers $(h_n)_{n\ge1}$ with $\lim_{n\to\infty} h_n = 0$.


©  Springer  Nature  Switzerland  AG  2018 

M.  Oberguggenberger  and  A.  Ostermann,  Analysis  for  Computer  Scientists , 
Undergraduate  Topics  in  Computer  Science, 
https://doi.org/10.1007/978-3-319-91155-7_6 
Definition 6.1 (Limits and continuity)

(a) The function $f$ has a limit $M$ at a point $x \in (a, b)$, if

$$\lim_{n\to\infty} f(x + h_n) = M$$

for all zero sequences $(h_n)_{n\ge1}$ with $h_n \ne 0$. In this case one writes

$$M = \lim_{h\to0} f(x + h) = \lim_{\xi\to x} f(\xi)$$

or

$$f(x + h) \to M \quad\text{as } h \to 0.$$

(b) The function $f$ has a right-hand limit $R$ at the point $x \in [a, b)$, if

$$\lim_{n\to\infty} f(x + h_n) = R$$

for all zero sequences $(h_n)_{n\ge1}$ with $h_n > 0$, with the corresponding notation

$$R = \lim_{h\to0+} f(x + h) = \lim_{\xi\to x+} f(\xi).$$

(c) The function $f$ has a left-hand limit $L$ at the point $x \in (a, b]$, if

$$\lim_{n\to\infty} f(x + h_n) = L$$

for all zero sequences $(h_n)_{n\ge1}$ with $h_n < 0$. Notations:

$$L = \lim_{h\to0-} f(x + h) = \lim_{\xi\to x-} f(\xi).$$

(d) If $f$ has a limit $M$ at $x \in (a, b)$ which coincides with the value of the function, i.e. $f(x) = M$, then $f$ is called continuous at the point $x$.

(e) If $f$ is continuous at every $x \in (a, b)$, then $f$ is said to be continuous on the open interval $(a, b)$. If in addition $f$ has right- and left-hand limits at the endpoints $a$ and $b$, it is called continuous on the closed interval $[a, b]$.

Fig. 6.1 Limit and continuity; left- and right-hand limits


Figure 6.1 illustrates the idea of approaching a point $x$ for $h \to 0$ as well as possible differences between left-hand and right-hand limits and the value of the function.

If a function $f$ is continuous at a point $x$, the function evaluation can be interchanged with the limit:

$$\lim_{\xi\to x} f(\xi) = f(x) = f\big(\lim_{\xi\to x} \xi\big).$$

The following examples show some further possibilities how a function can behave in the neighbourhood of a point: jump discontinuity with left- and right-hand limits, vertical asymptote, oscillations with non-vanishing amplitude and ever-increasing frequency.

Example 6.2 The quadratic function $f(x) = x^2$ is continuous at every $x \in \mathbb{R}$ since

$$f(x + h_n) - f(x) = (x + h_n)^2 - x^2 = 2xh_n + h_n^2 \to 0$$

as $n \to \infty$ for any zero sequence $(h_n)_{n\ge1}$. Therefore

$$\lim_{h\to0} f(x + h) = f(x).$$

Likewise the continuity of the power functions $x \mapsto x^m$ for $m \in \mathbb{N}$ can be shown.

Example 6.3 The absolute value function $f(x) = |x|$ and the third root $g(x) = \sqrt[3]{x}$ are everywhere continuous. The former has a kink at $x = 0$, the latter a vertical tangent; see Fig. 6.2.

Example 6.4 The sign function $f(x) = \operatorname{sign} x$ has different left- and right-hand limits $L = -1$, $R = 1$ at $x = 0$. In particular, it is discontinuous at that point. At all other points $x \ne 0$ it is continuous; see Fig. 6.3.
Fig. 6.2 Continuity and kink or vertical tangent

Fig. 6.3 Discontinuities: jump discontinuity and exceptional value (panel shown: $y = (\operatorname{sign} x)^2$)

Example 6.5 The square of the sign function

$$g(x) = (\operatorname{sign} x)^2 = \begin{cases} 1, & x \ne 0 \\ 0, & x = 0 \end{cases}$$

has equal left- and right-hand limits at $x = 0$. However, they are different from the value of the function (see Fig. 6.3):

$$\lim_{\xi\to0} g(\xi) = 1 \ne 0 = g(0).$$

Therefore, $g$ is discontinuous at $x = 0$.

Example 6.6 The functions $f(x) = \frac{1}{x}$ and $g(x) = \tan x$ have vertical asymptotes at $x = 0$ and $x = \frac{\pi}{2} + k\pi$, $k \in \mathbb{Z}$, respectively, and in particular no left- or right-hand limit at these points. At all other points, however, they are continuous. We refer to Figs. 2.9 and 3.10.

Example 6.7 The function $f(x) = \sin\frac{1}{x}$ has no left- or right-hand limit at $x = 0$ but oscillates with non-vanishing amplitude (Fig. 6.4). Indeed, one obtains different limits for different zero sequences. For example, for

$$h_n = \frac{1}{n\pi}, \qquad k_n = \frac{1}{\pi/2 + 2n\pi}, \qquad l_n = \frac{1}{3\pi/2 + 2n\pi}$$

the respective limits are

$$\lim_{n\to\infty} f(h_n) = 0, \qquad \lim_{n\to\infty} f(k_n) = 1, \qquad \lim_{n\to\infty} f(l_n) = -1.$$

All other values in the interval $[-1, 1]$ can also be obtained as limits with the help of suitable zero sequences.

Fig. 6.4 No limits, oscillation with non-vanishing amplitude (graph of $y = \sin(1/x)$)

Fig. 6.5 Continuity, oscillation with vanishing amplitude (graph of $y = x \sin(1/x)$)
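The three zero sequences can be evaluated directly. The following Python sketch (the function names mirror the symbols above and are otherwise arbitrary) shows that the function values along them stay near 0, 1 and -1, up to floating-point rounding:

```python
import math

def f(x):
    """f(x) = sin(1/x) for x != 0."""
    return math.sin(1.0 / x)

def h(n):
    return 1.0 / (n * math.pi)                        # f(h_n) = sin(n*pi) = 0

def k(n):
    return 1.0 / (math.pi / 2 + 2 * n * math.pi)      # f(k_n) = 1

def l(n):
    return 1.0 / (3 * math.pi / 2 + 2 * n * math.pi)  # f(l_n) = -1

for n in (1, 10, 100, 1000):
    print(n, f(h(n)), f(k(n)), f(l(n)))
```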

Example 6.8 The function $g(x) = x \sin\frac{1}{x}$ can be continuously extended by $g(0) = 0$ at $x = 0$; it oscillates with vanishing amplitude (Fig. 6.5). Indeed,

$$|g(h_n) - g(0)| = \left|h_n \sin\frac{1}{h_n} - 0\right| \le |h_n| \to 0$$

for all zero sequences $(h_n)_{n\ge1}$, thus $\lim_{h\to0} h \sin\frac{1}{h} = 0$.

Experiment 6.9 Open the M-files mat06_1.m and mat06_2.m, and study the graphs of the functions in Figs. 6.4 and 6.5 with the use of the zoom tool in the figure window. How can you improve the accuracy of the visualisation in the neighbourhood of $x = 0$?
6.2 Trigonometric Limits

Comparing the areas in Fig. 6.6 shows that the area of the grey triangle with sides $\cos x$ and $\sin x$ is smaller than the area of the sector which in turn is smaller than or equal to the area of the big triangle with sides 1 and $\tan x$.

The area of a sector in the unit circle (with angle $x$ in radian measure) equals $x/2$ as is well-known. In summary we obtain the inequalities

$$\frac12 \sin x \cos x \le \frac{x}{2} \le \frac12 \tan x$$

or after division by $\sin x$ and taking the reciprocal

$$\cos x \le \frac{\sin x}{x} \le \frac{1}{\cos x},$$

valid for all $x$ with $0 < |x| < \pi/2$.

With the help of these inequalities we can compute several important limits. From an elementary geometric consideration, one obtains

$$|\cos x| \ge \frac12 \quad\text{for } -\frac{\pi}{3} \le x \le \frac{\pi}{3},$$

and together with the previous inequalities

$$|\sin h_n| \le \frac{|h_n|}{|\cos h_n|} \le 2|h_n| \to 0$$

for all zero sequences $(h_n)_{n\ge1}$. This means that

$$\lim_{h\to0} \sin h = 0.$$

Fig. 6.6 Illustration of trigonometric inequalities
The sine function is therefore continuous at zero. From the continuity of the square function and the root function as well as the fact that $\cos h$ equals the positive square root of $1 - \sin^2 h$ for small $h$ it follows that

$$\lim_{h\to0} \cos h = \lim_{h\to0} \sqrt{1 - \sin^2 h} = 1.$$

With this the continuity of the sine function at every point $x \in \mathbb{R}$ can be proven:

$$\lim_{h\to0} \sin(x + h) = \lim_{h\to0} \left(\sin x \cos h + \cos x \sin h\right) = \sin x.$$

The inequality illustrated at the beginning of the section allows one to deduce one of the most important trigonometric limits. It forms the basis of the differentiation rules for trigonometric functions.

Proposition 6.10 $\lim_{x\to0} \frac{\sin x}{x} = 1$.

Proof We combine the above result $\lim_{x\to0} \cos x = 1$ with the inequality deduced earlier and obtain

$$1 = \lim_{x\to0} \cos x \le \lim_{x\to0} \frac{\sin x}{x} \le \lim_{x\to0} \frac{1}{\cos x} = 1,$$

and therefore $\lim_{x\to0} \frac{\sin x}{x} = 1$. □
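The enclosure of $\sin x / x$ between $\cos x$ and $1/\cos x$ can be observed numerically. The Python sketch below uses a hypothetical helper `sandwich` and a few sample points with $0 < |x| < \pi/2$:

```python
import math

def sandwich(x):
    """Return (cos x, sin x / x, 1 / cos x) for 0 < |x| < pi/2."""
    return math.cos(x), math.sin(x) / x, 1.0 / math.cos(x)

for x in (1.0, 0.1, 0.01, 0.001, -0.001):
    lower, middle, upper = sandwich(x)
    assert lower <= middle <= upper     # the inequality used in the proof
    print(x, lower, middle, upper)
```

As $x$ approaches 0, the lower and upper bounds both approach 1 and force the middle term to do the same.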


6.3  Zeros  of  Continuous  Functions 

Figure 6.7 shows the graph of a function that is continuous on a closed interval $[a, b]$ and that is negative at the left endpoint and positive at the right endpoint. Geometrically the graph has to intersect the $x$-axis at least once since it has no jumps due to the continuity. This means that $f$ has to have at least one zero in $(a, b)$. This is a criterion that guarantees the existence of a solution to the equation $f(x) = 0$. A first rigorous proof of this intuitively evident statement goes back to Bolzano.

Proposition 6.11 (Intermediate value theorem) Let $f : [a, b] \to \mathbb{R}$ be continuous and $f(a) < 0$, $f(b) > 0$. Then there exists a point $\xi \in (a, b)$ with $f(\xi) = 0$.

Proof The proof is based on the successive bisection of the intervals and the completeness of the set of real numbers. One starts with the interval $[a, b]$ and sets $a_1 = a$, $b_1 = b$.
Fig. 6.7 The intermediate value theorem


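Step 1: Compute $y_1 = f\left(\frac{a_1 + b_1}{2}\right)$.

If $y_1 > 0$: set $a_2 = a_1$, $b_2 = \frac{a_1 + b_1}{2}$.

If $y_1 < 0$: set $a_2 = \frac{a_1 + b_1}{2}$, $b_2 = b_1$.

If $y_1 = 0$: termination, $\xi = \frac{a_1 + b_1}{2}$ is a zero.

By construction $f(a_2) < 0$, $f(b_2) > 0$ and the interval length is halved:

\[ b_2 - a_2 = \frac{1}{2}(b_1 - a_1). \]

Step 2: Compute $y_2 = f\left(\frac{a_2 + b_2}{2}\right)$.

If $y_2 > 0$: set $a_3 = a_2$, $b_3 = \frac{a_2 + b_2}{2}$.

If $y_2 < 0$: set $a_3 = \frac{a_2 + b_2}{2}$, $b_3 = b_2$.

If $y_2 = 0$: termination, $\xi = \frac{a_2 + b_2}{2}$ is a zero.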

Further iterations lead to a monotonically increasing sequence

\[ a_1 \le a_2 \le a_3 \le \cdots \le b \]

which is bounded from above. According to Proposition 5.10 the limit $\xi = \lim_{n \to \infty} a_n$ exists.

On the other hand $|a_n - b_n| \le |a - b| / 2^{n-1} \to 0$, therefore $\lim_{n \to \infty} b_n = \xi$ as well. If $\xi$ has not appeared after a finite number of steps as either $a_k$ or $b_k$ then for all $n \in \mathbb{N}$:

\[ f(a_n) < 0, \qquad f(b_n) > 0. \]

From the continuity of $f$ it follows that

\[ f(\xi) = \lim_{n \to \infty} f(a_n) \le 0, \qquad f(\xi) = \lim_{n \to \infty} f(b_n) \ge 0, \]

which implies $f(\xi) = 0$, as claimed. □

The proof provides at the same time a numerical method to compute zeros of functions, the bisection method. Although it converges rather slowly, it is easily implementable and universally applicable, also for non-differentiable, continuous functions. For differentiable functions, however, considerably faster algorithms exist. The order of convergence and the discussion of faster procedures will be taken up in Sect. 8.2.

Example 6.12 Calculation of $\sqrt{2}$ as the root of $f(x) = x^2 - 2 = 0$ in the interval $[1, 2]$ using the bisection method:

Start: $f(1) = -1 < 0$, $f(2) = 2 > 0$;

Step 1: $f(1.5) = 0.25 > 0$;

Step 2: $f(1.25) = -0.4375 < 0$;

Step 3: $f(1.375) = -0.109375 < 0$;

Step 4: $f(1.4375) = 0.066406\ldots > 0$;

Step 5: $f(1.40625) = -0.022461\ldots < 0$;

etc.

After 5 steps the first decimal place is ascertained:

\[ 1.40625 < \sqrt{2} < 1.4375. \]
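The bisection steps of this example can be reproduced with a few lines of code. The following Python sketch is only an illustration under the assumptions of Proposition 6.11 (continuity, $f(a) < 0 < f(b)$); the function name `bisect` and its return format are choices made here, not the book's own program.

```python
def bisect(f, a, b, steps):
    """Bisection as in the proof of the intermediate value theorem.

    Assumes f is continuous on [a, b] with f(a) < 0 < f(b).
    Returns the list of intervals [(a_1, b_1), (a_2, b_2), ...].
    """
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 and f(b) > 0")
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        y = f(mid)
        if y > 0:
            b = mid            # sign change lies in the left half
        elif y < 0:
            a = mid            # sign change lies in the right half
        else:
            a = b = mid        # the midpoint is an exact zero
            intervals.append((a, b))
            break
        intervals.append((a, b))
    return intervals

# Example 6.12: f(x) = x^2 - 2 on [1, 2], five bisection steps.
intervals = bisect(lambda x: x * x - 2, 1.0, 2.0, steps=5)
for n, (a_n, b_n) in enumerate(intervals, start=1):
    print(f"a_{n} = {a_n}, b_{n} = {b_n}")
```

Each step halves the interval length, so roughly 3.3 steps are needed per additional decimal digit; this is the slow convergence mentioned above.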

Experiment 6.13 Sketch the graph of the function $y = x^3 + 3x^2 - 2$ on the interval $[-3, 2]$, and try to first estimate graphically one of the roots by successive bisection. Execute the interval bisection with the help of the applet Bisection method. Assure yourself of the plausibility of the intermediate value theorem using the applet Animation of the intermediate value theorem.

As  an  important  application  of  the  intermediate  value  theorem  we  now  show  that 
images  of  intervals  under  continuous  functions  are  again  intervals.  For  the  different 
types  of  intervals  which  appear  in  the  following  proposition  we  refer  to  Sect.  1.2;  for 
the  notion  of  the  proper  range  to  Sect.  2.1. 

Proposition 6.14 Let $I \subset \mathbb{R}$ be an interval (open, half-open or closed, bounded or improper) and $f : I \to \mathbb{R}$ a continuous function with proper range $J = f(I)$. Then $J$ is also an interval.

Proof As subsets of the real line, intervals are characterised by the following property: With any two points all intermediate points are contained in it as well. Let $y_1, y_2 \in J$, $y_1 < y_2$, and $\eta$ be an intermediate point, i.e. $y_1 < \eta < y_2$. Since $f : I \to J$ is surjective there are $x_1, x_2 \in I$ such that $y_1 = f(x_1)$ and $y_2 = f(x_2)$. We consider the case $x_1 < x_2$. Since $f(x_1) - \eta < 0$ and $f(x_2) - \eta > 0$ it follows from the intermediate value theorem applied on the interval $[x_1, x_2]$ that there exists a point $\xi \in (x_1, x_2)$ with $f(\xi) - \eta = 0$, thus $f(\xi) = \eta$. Hence $\eta$ is attained as a value of the function and therefore lies in $J = f(I)$. □


Interval endpoints for the bisection steps of Example 6.12:

$a_1 = 1$, $b_1 = 2$
$a_2 = 1$, $b_2 = 1.5$
$a_3 = 1.25$, $b_3 = 1.5$
$a_4 = 1.375$, $b_4 = 1.5$
$a_5 = 1.375$, $b_5 = 1.4375$
$a_6 = 1.40625$, $b_6 = 1.4375$
Proposition 6.15 Let $I = [a, b]$ be a closed, bounded interval and $f : I \to \mathbb{R}$ a continuous function. Then the proper range $J = f(I)$ is also a closed, bounded interval.

Proof According to Proposition 6.14 the range $J$ is an interval. Let $d$ be the least upper bound (possibly $d = \infty$). We take a sequence of values $y_n \in J$ which converges to $d$. The values $y_n$ are function values of certain arguments $x_n \in I = [a, b]$. The sequence $(x_n)_{n \ge 1}$ is bounded and, according to Proposition 5.30, has an accumulation point $x_0$, $a \le x_0 \le b$. Thus a subsequence $(x_{n_j})_{j \ge 1}$ exists which converges to $x_0$ (see Sect. 5.4). From the continuity of the function $f$ it follows that

\[ d = \lim_{j \to \infty} y_{n_j} = \lim_{j \to \infty} f(x_{n_j}) = f(x_0). \]

This shows that the upper endpoint of the interval $J$ is finite and is attained as function value. The same argument is applied to the lower boundary $c$; the range $J$ is therefore a closed, bounded interval $[c, d]$. □

From the proof of the proposition it is clear that $d$ is the largest and $c$ the smallest value of the function $f$ on the interval $[a, b]$. We thus obtain the following important consequence.

Corollary  6.16  Each  continuous  function  defined  on  a  closed  interval  I  =  [a,  b] 
attains  its  maximum  and  minimum  there. 


6.4  Exercises 

1. (a) Investigate the behaviour of the functions

\[ \frac{x + x^2}{|x|}, \qquad \frac{\sqrt{1 + x} - 1}{x}, \qquad \frac{x^2 + \sin x}{\sqrt{1 - \cos^2 x}} \]

in a neighbourhood of $x = 0$ by plotting their graphs for arguments in $[-2, -\frac{1}{100}) \cup (\frac{1}{100}, 2]$.

(b) Find out by inspection of the graphs whether there are left- or right-hand limits at $x = 0$. Which value do they have? Explain your results by rearranging the expressions in (a).

Hint. Some guidance for part (a) can be found in the M-file mat06_ex1.m. Expand the middle term in (b) with $\sqrt{1 + x} + 1$.
2. Do the following functions have a limit at the given points? If so, what is its value?

(a) $y = x^3 + 5x + 10$, $x = 1$.

(b) $y = \frac{x^2 - 1}{x^2 + x}$, $x = 0$, $x = 1$, $x = -1$.

(c) $y = \frac{1 - \cos x}{x^2}$, $x = 0$.

Hint. Expand with $(1 + \cos x)$.

(d) $y = \operatorname{sign} x \cdot \sin x$, $x = 0$.

(e) $y = \operatorname{sign} x \cdot \cos x$, $x = 0$.

3. Let $f_n(x) = \arctan nx$, $g_n(x) = (1 + x^2)^{-n}$. Compute the limits

\[ f(x) = \lim_{n \to \infty} f_n(x), \qquad g(x) = \lim_{n \to \infty} g_n(x) \]

for each $x \in \mathbb{R}$, and sketch the graphs of the thereby defined functions $f$ and $g$. Are they continuous? Plot $f_n$ and $g_n$ using MATLAB, and investigate the behaviour of the graphs for $n \to \infty$.

Hint. Some advice can be found in the M-file mat06_ex3.m.

4.  With  the  help  of  zero  sequences,  carry  out  a  formal  proof  of  the  fact  that  the 
absolute  value  function  and  the  third  root  function  of  Example  6.3  are  continuous. 

5. Argue with the help of the intermediate value theorem that $p(x) = x^3 + 5x + 10$ has a zero in the interval $[-2, 1]$. Compute this zero up to four decimal places using the applet Bisection method.

6. Compute all zeros of the following functions in the given interval with accuracy $10^{-3}$, using the applet Bisection method.

$f(x) = x^4 - 2$, $I = \mathbb{R}$;
$g(x) = x - \cos x$, $I = \mathbb{R}$;
$h(x) = \sin\frac{1}{x}$, $I = \left[\frac{1}{20}, \frac{1}{10}\right]$.

7. Write a MATLAB program which locates, with the help of the bisection method, a zero of an arbitrary polynomial

\[ p(x) = x^3 + c_1 x^2 + c_2 x + c_3 \]

of degree three. Your program should automatically provide starting values $a$, $b$ with $p(a) < 0$, $p(b) > 0$ (why do such values always exist?). Test your program by choosing the coefficient vector $(c_1, c_2, c_3)$ randomly, for example by using c = 1000*rand(1,3).

Hint. A solution is suggested in the M-file mat06_ex7.m.




The  Derivative  of  a  Function 


Starting  from  the  problem  to  define  the  tangent  to  the  graph  of  a  function,  we  introduce 
the  derivative  of  a  function.  Two  points  on  the  graph  can  always  be  joined  by  a  secant, 
which  is  a  good  model  for  the  tangent  whenever  these  points  are  close  to  each  other.  In 
a  limiting  process,  the  secant  (discrete  model)  is  replaced  by  the  tangent  (continuous 
model).  Differential  calculus,  which  is  based  on  this  limiting  process,  has  become 
one  of  the  most  important  building  blocks  of  mathematical  modelling. 

In this section we discuss the derivative of important elementary functions as well as general differentiation rules. Thanks to the meticulous implementation of these rules, expert systems such as maple have become helpful tools in mathematical analysis. Furthermore, we will discuss the interpretation of the derivative as linear approximation and as rate of change. These interpretations form the basis of numerous applications in science and engineering.

The concept of the numerical derivative follows the opposite direction. The continuous model is discretised, and the derivative is replaced by a difference quotient. We carry out a detailed error analysis which allows us to find an optimal approximation. Further, we will illustrate the relevance of symmetry in numerical procedures.


7.1  Motivation 

Example 7.1 (The free fall according to Galilei¹) Imagine an object which, released at time $t = 0$, falls down under the influence of gravity. We are interested in the position $s(t)$ of the object at time $t \ge 0$ as well as in its velocity $v(t)$, see Fig. 7.1. Due to the definition of velocity as change in travelled distance divided by change in time, the object has the average velocity

\[ v_{\text{average}} = \frac{s(t + \Delta t) - s(t)}{\Delta t} \]

in the time interval $[t, t + \Delta t]$. In order to obtain the instantaneous velocity $v = v(t)$ we take the limit $\Delta t \to 0$ in the above formula and arrive at

\[ v(t) = \lim_{\Delta t \to 0} \frac{s(t + \Delta t) - s(t)}{\Delta t}. \]

Galilei discovered through his experiments that the travelled distance in free fall increases quadratically with the time passed, i.e. the law

\[ s(t) = \frac{g}{2} t^2 \]

with $g \approx 9.81\ \mathrm{m/s^2}$ holds. Thus we obtain the expression

\[ v(t) = \lim_{\Delta t \to 0} \frac{\frac{g}{2}(t + \Delta t)^2 - \frac{g}{2} t^2}{\Delta t} = \frac{g}{2} \lim_{\Delta t \to 0} (2t + \Delta t) = g t \]

for the instantaneous velocity. The velocity is hence proportional to the time passed.

Fig. 7.1 The free fall

¹ G. Galilei, 1564-1642.
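The passage from average to instantaneous velocity can be watched numerically. The sketch below is an illustration in Python; the time $t = 2$ and the step sizes are arbitrary choices, and the function names are introduced here for the example only.

```python
G = 9.81  # gravitational acceleration in m/s^2, value taken from the example

def s(t):
    """Travelled distance in free fall, s(t) = g/2 * t^2."""
    return 0.5 * G * t ** 2

def average_velocity(t, dt):
    """Average velocity on the time interval [t, t + dt]."""
    return (s(t + dt) - s(t)) / dt

t = 2.0  # arbitrary point in time
for dt in [1.0, 0.1, 0.01, 0.001]:
    print(f"dt = {dt:<6} average velocity = {average_velocity(t, dt):.6f}")
print(f"instantaneous velocity g*t = {G * t:.6f}")
```

The average velocity equals $g t + \frac{g}{2} \Delta t$, so the printed values differ from $g t$ by a term that shrinks linearly with the step size.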


Example 7.2 (The tangent problem) Consider a real function $f$ and two different points $P = (x_0, f(x_0))$ and $Q = (x, f(x))$ on the graph of the function. The uniquely defined straight line through these two points is called secant of the function $f$ through $P$ and $Q$, see Fig. 7.2. The slope of the secant is given by the difference quotient

\[ \frac{\Delta y}{\Delta x} = \frac{f(x) - f(x_0)}{x - x_0}. \]

As $x$ tends to $x_0$, the secant graphically turns into the tangent, provided the limit exists. Motivated by this idea we define the slope

\[ k = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} \]

of the function $f$ at $x_0$. If this limit exists, we call the straight line

\[ y = k \cdot (x - x_0) + f(x_0) \]

the tangent to the graph of the function at the point $(x_0, f(x_0))$.


Fig. 7.2 Slope of the secant


Experiment 7.3 Plot the function $f(x) = x^2$ on the interval $[0, 2]$ in MATLAB. Draw the straight lines through the points $(1, 1)$, $(2, z)$ for various values of $z$. Adjust $z$ until you find the tangent to the graph of the function $f$ at $(1, 1)$ and read off its slope.


7.2  The  Derivative 

Motivated by the above applications we are going to define the derivative of a real-valued function.

Definition 7.4 (Derivative) Let $I \subset \mathbb{R}$ be an open interval, $f : I \to \mathbb{R}$ a real-valued function and $x_0 \in I$.

(a) The function $f$ is called differentiable at $x_0$ if the difference quotient

\[ \frac{\Delta y}{\Delta x} = \frac{f(x) - f(x_0)}{x - x_0} \]

has a (finite) limit for $x \to x_0$. In this case one writes

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} \]

and calls the limit derivative of $f$ at the point $x_0$.

(b) The function $f$ is called differentiable (in the interval $I$) if $f'(x)$ exists for all $x \in I$. In this case the function

\[ f' : I \to \mathbb{R} : x \mapsto f'(x) \]

is called the derivative of $f$. The process of computing $f'$ from $f$ is called differentiation.

In place of $f'(x)$ one often writes $\frac{\mathrm{d}f}{\mathrm{d}x}(x)$ or $\frac{\mathrm{d}}{\mathrm{d}x} f(x)$, respectively. The following examples show how the derivative of a function is obtained by means of the limiting process above.
Example 7.5 (The constant function $f(x) = c$)

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{c - c}{h} = \lim_{h \to 0} \frac{0}{h} = 0. \]

The derivative of a constant function is zero.

Example 7.6 (The affine function $g(x) = ax + b$)

\[ g'(x) = \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} = \lim_{h \to 0} \frac{ax + ah + b - ax - b}{h} = \lim_{h \to 0} a = a. \]

The derivative is the slope $a$ of the straight line $y = ax + b$.

Example 7.7 (The derivative of the quadratic function $y = x^2$)

\[ y' = \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} = \lim_{h \to 0} \frac{2hx + h^2}{h} = \lim_{h \to 0} (2x + h) = 2x. \]

Similarly, one can show for the power function (with $n \in \mathbb{N}$):

\[ f(x) = x^n \quad \Rightarrow \quad f'(x) = n \cdot x^{n-1}. \]

Example 7.8 (The derivative of the square root function $y = \sqrt{x}$ for $x > 0$)

\[ y' = \lim_{\xi \to x} \frac{\sqrt{\xi} - \sqrt{x}}{\xi - x} = \lim_{\xi \to x} \frac{\sqrt{\xi} - \sqrt{x}}{(\sqrt{\xi} - \sqrt{x})(\sqrt{\xi} + \sqrt{x})} = \lim_{\xi \to x} \frac{1}{\sqrt{\xi} + \sqrt{x}} = \frac{1}{2\sqrt{x}}. \]
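The limits in Examples 7.7 and 7.8 can be checked by evaluating the difference quotient for decreasing $h$. The following Python sketch is an illustration only; the point $x = 3$ and the step sizes are arbitrary, and `difference_quotient` is a name chosen here.

```python
import math

def difference_quotient(f, x, h):
    """Slope of the secant through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

x = 3.0  # arbitrary evaluation point
for h in [1e-1, 1e-3, 1e-5]:
    dq_square = difference_quotient(lambda u: u * u, x, h)
    dq_root = difference_quotient(math.sqrt, x, h)
    print(f"h = {h:<7} (x^2)' ~ {dq_square:.6f}   (sqrt x)' ~ {dq_root:.6f}")

# Exact values from Examples 7.7 and 7.8 for comparison.
print(f"exact:    2x = {2 * x:.6f}   1/(2 sqrt x) = {1 / (2 * math.sqrt(x)):.6f}")
```

For the quadratic function the difference quotient equals $2x + h$ exactly, in agreement with the computation in Example 7.7.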


Example 7.9 (Derivatives of the sine and cosine functions) We first recall from Proposition 6.10 that

\[ \lim_{t \to 0} \frac{\sin t}{t} = 1. \]

Due to

\[ (\cos t - 1)(\cos t + 1) = -\sin^2 t \]

it also holds that

\[ \frac{\cos t - 1}{t} = -\underbrace{\sin t}_{\to 0} \cdot \underbrace{\frac{\sin t}{t}}_{\to 1} \cdot \underbrace{\frac{1}{\cos t + 1}}_{\to 1/2} \to 0 \quad \text{for } t \to 0, \]

and thus

\[ \lim_{t \to 0} \frac{\cos t - 1}{t} = 0. \]

Due to the addition theorems (Proposition 3.3) we get with the preparations from above

\[
\begin{aligned}
\sin' x &= \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} = \lim_{h \to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\
&= \lim_{h \to 0} \sin x \cdot \frac{\cos h - 1}{h} + \lim_{h \to 0} \cos x \cdot \frac{\sin h}{h} \\
&= \sin x \cdot \underbrace{\lim_{h \to 0} \frac{\cos h - 1}{h}}_{= 0} + \cos x \cdot \underbrace{\lim_{h \to 0} \frac{\sin h}{h}}_{= 1} \\
&= \cos x.
\end{aligned}
\]

This proves the formula $\sin' x = \cos x$. Likewise it can be shown that $\cos' x = -\sin x$.

Example 7.10 (The derivative of the exponential function with base e) Rearranging terms in the series expansion of the exponential function (Proposition C.12) we obtain

\[ \frac{e^h - 1}{h} = \sum_{k=0}^{\infty} \frac{h^k}{(k + 1)!} = 1 + \frac{h}{2} + \frac{h^2}{6} + \frac{h^3}{24} + \cdots \]

From that one infers

\[ \left| \frac{e^h - 1}{h} - 1 \right| \le |h| \left( \frac{1}{2} + \frac{|h|}{6} + \frac{|h|^2}{24} + \cdots \right) \le |h| \, e^{|h|}. \]

Letting $h \to 0$ hence gives the important limit

\[ \lim_{h \to 0} \frac{e^h - 1}{h} = 1. \]

The existence of the limit

\[ \lim_{h \to 0} \frac{e^{x + h} - e^x}{h} = e^x \cdot \lim_{h \to 0} \frac{e^h - 1}{h} = e^x \]

shows that the exponential function is differentiable and that $(e^x)' = e^x$.


Example 7.11 (New representation of Euler's number) By substituting $y = e^h - 1$, $h = \log(y + 1)$ in the above limit one obtains

\[ \lim_{y \to 0} \frac{y}{\log(y + 1)} = 1 \]

and in this way

\[ \lim_{y \to 0} \log (1 + \alpha y)^{1/y} = \lim_{y \to 0} \frac{\log(1 + \alpha y)}{y} = \alpha \lim_{y \to 0} \frac{\log(1 + \alpha y)}{\alpha y} = \alpha. \]

Due to the continuity of the exponential function it further follows that

\[ \lim_{y \to 0} (1 + \alpha y)^{1/y} = e^{\alpha}. \]

In particular, for $y = 1/n$, we obtain a new representation of the exponential function

\[ e^{\alpha} = \lim_{n \to \infty} \left( 1 + \frac{\alpha}{n} \right)^n. \]

For $\alpha = 1$ the identity

\[ e = \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n = \sum_{k=0}^{\infty} \frac{1}{k!} = 2.718281828459\ldots \]

follows.
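The two representations of Euler's number converge at very different speeds, which a short computation makes visible. The following Python sketch is illustrative; the function names and the chosen values of $n$ are assumptions made for this example.

```python
import math

def limit_form(n):
    """The n-th term of the sequence (1 + 1/n)^n."""
    return (1 + 1 / n) ** n

def series_form(terms):
    """Partial sum of the series 1/0! + 1/1! + ... + 1/(terms-1)!."""
    total, term = 0.0, 1.0      # term holds 1/k!
    for k in range(terms):
        total += term
        term /= (k + 1)
    return total

for n in [10, 100, 10_000, 1_000_000]:
    print(f"n = {n:<9} (1 + 1/n)^n = {limit_form(n):.12f}")
for terms in [5, 10, 20]:
    print(f"{terms:>2} series terms: {series_form(terms):.12f}")
print(f"math.e          = {math.e:.12f}")
```

The series reaches machine precision after about 20 terms, whereas the limit form still shows an error of order $1/n$.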


Example 7.12 Not every continuous function is differentiable. For instance, the function

\[ f(x) = |x| = \begin{cases} x, & x \ge 0 \\ -x, & x < 0 \end{cases} \]

is not differentiable at the vertex $x = 0$, see Fig. 7.3, left picture. However, it is differentiable for $x \ne 0$ with

\[ (|x|)' = \begin{cases} 1, & \text{if } x > 0 \\ -1, & \text{if } x < 0. \end{cases} \]

The function $g(x) = \sqrt[3]{x}$ is not differentiable at $x = 0$ either. The reason for that is the vertical tangent, see Fig. 7.3, right picture.

There are even continuous functions that are nowhere differentiable. It is possible to write down such functions in the form of certain intricate infinite series. However, an analogous example of a (continuous) curve in the plane which is nowhere differentiable is the boundary of Koch's snowflake, which can be constructed in a simple geometric manner, see Examples 9.9 and 14.17.

Fig. 7.3 Functions that are not differentiable at $x = 0$ (left: $y = |x|$, right: $y = \sqrt[3]{x}$)

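Definition 7.13 If the function $f'$ is again differentiable then

\[ f''(x) = \frac{\mathrm{d}^2}{\mathrm{d}x^2} f(x) = \frac{\mathrm{d}^2 f}{\mathrm{d}x^2}(x) = \lim_{h \to 0} \frac{f'(x + h) - f'(x)}{h} \]

is called the second derivative of $f$ with respect to $x$. Likewise higher derivatives are defined recursively as

\[ f'''(x) = \left( f''(x) \right)' \quad \text{or} \quad \frac{\mathrm{d}^3}{\mathrm{d}x^3} f(x) = \frac{\mathrm{d}}{\mathrm{d}x} \left( \frac{\mathrm{d}^2}{\mathrm{d}x^2} f(x) \right), \quad \text{etc.} \]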

Differentiating with maple. Using maple one can differentiate expressions as well as functions. If the expression g is of the form

g := x^2 - a*x;

then the corresponding function f is defined by

f := x -> x^2 - a*x;

The evaluation of functions generates expressions, for example f(t) produces the expression $t^2 - at$. Conversely, expressions can be converted to functions using unapply

h := unapply(g, x);

The derivative of expressions can be obtained using diff, those of functions using D. Examples can be found in the maple worksheet mp07_1.mws.


7.3  Interpretations  of  the  Derivative 

We introduced the derivative geometrically as the slope of the tangent, and we saw that the tangent to a graph of a differentiable function $f$ at the point $(x_0, f(x_0))$ is given by

\[ y = f'(x_0)(x - x_0) + f(x_0). \]

Example 7.14 Let $f(x) = x^4 + 1$ with derivative $f'(x) = 4x^3$.

(i) The tangent to the graph of $f$ at the point $(0, 1)$ is

\[ y = f'(0) \cdot (x - 0) + f(0) = 1 \]

and thus horizontal.

(ii) The tangent to the graph of $f$ at the point $(1, 2)$ is

\[ y = f'(1)(x - 1) + 2 = 4(x - 1) + 2 = 4x - 2. \]

The  derivative  allows  further  interpretations. 

Interpretation as linear approximation. We start off by emphasising that every differentiable function $f$ can be written in the form

\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + R(x, x_0), \]

where the remainder $R(x, x_0)$ has the property

\[ \lim_{x \to x_0} \frac{R(x, x_0)}{x - x_0} = 0. \]

This follows immediately from

\[ R(x, x_0) = f(x) - f(x_0) - f'(x_0)(x - x_0) \]

by dividing by $x - x_0$, since

\[ \frac{f(x) - f(x_0)}{x - x_0} \to f'(x_0) \quad \text{as } x \to x_0. \]


Application 7.15 As we have just seen, a differentiable function $f$ is characterised by the property that

\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + R(x, x_0), \]

where the remainder term $R(x, x_0)$ tends faster to zero than $x - x_0$. Taking the limit $x \to x_0$ in this equation shows in particular that every differentiable function is continuous.


Application 7.16 Let $g$ be the function given by

\[ g(x) = k \cdot (x - x_0) + f(x_0). \]

Its graph is the straight line with slope $k$ passing through the point $(x_0, f(x_0))$. Since

\[ \frac{f(x) - g(x)}{x - x_0} = \frac{f(x) - f(x_0) - k \cdot (x - x_0)}{x - x_0} = f'(x_0) - k + \frac{R(x, x_0)}{x - x_0} \to f'(x_0) - k \]

as $x \to x_0$, the tangent with $k = f'(x_0)$ is the straight line which approximates the graph best. One therefore calls

\[ g(x) = f(x_0) + f'(x_0) \cdot (x - x_0) \]

the linear approximation to $f$ at $x_0$. For $x$ close to $x_0$ one can consider $g(x)$ as a good approximation to $f(x)$. In applications the (possibly complicated) function $f$ is often replaced by its linear approximation $g$ which is easier to handle.


Example 7.17 Let $f(x) = \sqrt{x} = x^{1/2}$. Consequently,

\[ f'(x) = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}. \]

We want to find the linear approximation to the function $f$ at $x_0 = a$. According to the formula above it holds that

\[ \sqrt{x} \approx g(x) = \sqrt{a} + \frac{1}{2\sqrt{a}} (x - a) \]

for $x$ close to $a$, or, alternatively with $h = x - a$,

\[ \sqrt{a + h} \approx \sqrt{a} + \frac{1}{2\sqrt{a}} h \quad \text{for small } h. \]

If we now substitute $a = 1$ and $h = 0.1$, we obtain the approximation

\[ \sqrt{1.1} \approx 1 + \frac{0.1}{2} = 1.05. \]

The first digits of the actual value are 1.0488...
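How quickly the linear approximation of the square root loses accuracy as $h$ grows can be tabulated directly. The sketch below is a Python illustration; `linear_sqrt` and the chosen values of $h$ are assumptions made for this example.

```python
import math

def linear_sqrt(a, h):
    """Linear approximation sqrt(a) + h / (2 sqrt(a)) to sqrt(a + h)."""
    return math.sqrt(a) + h / (2 * math.sqrt(a))

a = 1.0
for h in [0.5, 0.1, 0.01]:
    approx = linear_sqrt(a, h)
    exact = math.sqrt(a + h)
    print(f"h = {h:<5} approx = {approx:.6f}  exact = {exact:.6f}  error = {approx - exact:.2e}")
```

Reducing $h$ by a factor of 10 reduces the error by roughly a factor of 100, consistent with a remainder that tends to zero faster than $h$.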


Physical interpretation as rate of change. In physical applications the derivative often plays the role of a rate of change. A well-known example from everyday life is the velocity, see Sect. 7.1. Consider a particle which is moving along a straight line. Let $s(t)$ be the position where the particle is at time $t$. The average velocity is given by the quotient

\[ \frac{s(t) - s(t_0)}{t - t_0} \qquad \text{(difference in displacement divided by difference in time).} \]

In the limit $t \to t_0$ the average velocity turns into the instantaneous velocity

\[ v(t_0) = \frac{\mathrm{d}s}{\mathrm{d}t}(t_0) = \dot{s}(t_0) = \lim_{t \to t_0} \frac{s(t) - s(t_0)}{t - t_0}. \]

Note that one often writes $\dot{f}(t)$ instead of $f'(t)$ if the time $t$ is the argument of the function $f$. In particular, in physics the dot notation is most commonly used.

Likewise one obtains the acceleration by differentiating the velocity

\[ a(t) = \dot{v}(t) = \ddot{s}(t). \]


The  notion  of  velocity  is  also  used  in  the  modelling  of  other  processes  that  vary  over 
time,  e.g.  for  growth  or  decay. 


7.4  Differentiation  Rules 

In this section $I \subset \mathbb{R}$ denotes an open interval. We first note that differentiation is a linear process.

Proposition 7.18 (Linearity of the derivative) Let $f, g : I \to \mathbb{R}$ be two functions which are differentiable at $x \in I$ and take $c \in \mathbb{R}$. Then the functions $f + g$ and $c \cdot f$ are differentiable at $x$ as well and

\[ \left( f(x) + g(x) \right)' = f'(x) + g'(x), \]

\[ \left( c f(x) \right)' = c f'(x). \]

Proof The result follows from the corresponding rules for limits. The first statement is true because

\[ \frac{f(x + h) + g(x + h) - (f(x) + g(x))}{h} = \underbrace{\frac{f(x + h) - f(x)}{h}}_{\to f'(x)} + \underbrace{\frac{g(x + h) - g(x)}{h}}_{\to g'(x)} \]

as $h \to 0$. The second statement follows similarly. □

Linearity together with the differentiation rule $(x^m)' = m x^{m-1}$ for powers implies that every polynomial is differentiable. Let

\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]

Then its derivative has the form

\[ p'(x) = n a_n x^{n-1} + (n - 1) a_{n-1} x^{n-2} + \cdots + a_1. \]

For example, $(3x^7 - 4x^2 + 5x - 1)' = 21x^6 - 8x + 5$.
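The formula for $p'$ acts directly on the coefficient vector, which makes it easy to implement. The following Python sketch is one possible illustration; the coefficient ordering (descending powers) and the name `poly_derivative` are conventions chosen here.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given by its coefficients.

    coeffs = [a_n, ..., a_1, a_0] in descending powers; the result
    uses the same convention for p'. A constant polynomial yields [0].
    """
    n = len(coeffs) - 1                       # degree of p
    derived = [(n - i) * c for i, c in enumerate(coeffs[:-1])]
    return derived or [0]

# p(x) = 3x^7 - 4x^2 + 5x - 1 from the example above
p = [3, 0, 0, 0, 0, -4, 5, -1]
print(poly_derivative(p))   # coefficients of 21x^6 - 8x + 5
```

Applying the function repeatedly gives the higher derivatives of the polynomial.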

The following two rules allow one to determine the derivative of products and quotients of functions from their factors.

Proposition 7.19 (Product rule) Let $f, g : I \to \mathbb{R}$ be two functions which are differentiable at $x \in I$. Then the function $f \cdot g$ is differentiable at $x$ and

\[ \left( f(x) \cdot g(x) \right)' = f'(x) \cdot g(x) + f(x) \cdot g'(x). \]


Proof This fact follows again from the corresponding rules for limits:

\[
\begin{aligned}
\frac{f(x + h) \cdot g(x + h) - f(x) \cdot g(x)}{h}
&= \frac{f(x + h) \cdot g(x + h) - f(x) \cdot g(x + h)}{h} + \frac{f(x) \cdot g(x + h) - f(x) \cdot g(x)}{h} \\
&= \underbrace{\frac{f(x + h) - f(x)}{h}}_{\to f'(x)} \cdot \underbrace{g(x + h)}_{\to g(x)} + f(x) \cdot \underbrace{\frac{g(x + h) - g(x)}{h}}_{\to g'(x)}
\end{aligned}
\]

as $h \to 0$. The required continuity of $g$ at $x$ is a consequence of Application 7.15. □


Proposition 7.20 (Quotient rule) Let $f, g : I \to \mathbb{R}$ be two functions differentiable at $x \in I$ and $g(x) \ne 0$. Then the quotient $\frac{f}{g}$ is differentiable at the point $x$ and

\[ \left( \frac{f(x)}{g(x)} \right)' = \frac{f'(x) \cdot g(x) - f(x) \cdot g'(x)}{g(x)^2}. \]

In particular,

\[ \left( \frac{1}{g(x)} \right)' = -\frac{g'(x)}{(g(x))^2}. \]

The proof is similar to the one for the product rule and can be found in [3, Chap. 3.1], for example.


Example 7.21 An application of the quotient rule to $\tan x = \frac{\sin x}{\cos x}$ shows that

\[ \tan' x = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x} = 1 + \tan^2 x. \]

Complicated functions can often be written as a composition of simpler functions. For example, the function

\[ h : [2, \infty) \to \mathbb{R} : x \mapsto h(x) = \sqrt{\log(x - 1)} \]

can be interpreted as $h(x) = f(g(x))$ with

\[ f : [0, \infty) \to \mathbb{R} : y \mapsto \sqrt{y}, \qquad g : [2, \infty) \to [0, \infty) : x \mapsto \log(x - 1). \]

One denotes the composition of the functions $f$ and $g$ by $h = f \circ g$. The following proposition shows how such compound functions can be differentiated.
Proposition 7.22 (Chain rule) The composition of two differentiable functions $g : I \to B$ and $f : B \to \mathbb{R}$ is also differentiable and

\[ \frac{\mathrm{d}}{\mathrm{d}x} f(g(x)) = f'(g(x)) \cdot g'(x). \]

In shorthand notation the rule is

\[ (f \circ g)' = (f' \circ g) \cdot g'. \]

Proof We write

\[
\begin{aligned}
\frac{1}{h} \left( f(g(x + h)) - f(g(x)) \right)
&= \frac{f(g(x + h)) - f(g(x))}{g(x + h) - g(x)} \cdot \frac{g(x + h) - g(x)}{h} \\
&= \frac{f(g(x) + k) - f(g(x))}{k} \cdot \frac{g(x + h) - g(x)}{h},
\end{aligned}
\]

where, due to the interpretation as a linear approximation (see Sect. 7.3), the expression

\[ k = g(x + h) - g(x) \]

is of the form

\[ k = g'(x) h + R(x + h, x) \]

and tends to zero itself as $h \to 0$. It follows that

\[
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d}x} f(g(x))
&= \lim_{h \to 0} \frac{1}{h} \left( f(g(x + h)) - f(g(x)) \right) \\
&= \lim_{h \to 0} \left( \frac{f(g(x) + k) - f(g(x))}{k} \cdot \frac{g(x + h) - g(x)}{h} \right) \\
&= f'(g(x)) \cdot g'(x)
\end{aligned}
\]

and hence the assertion of the proposition. □


The differentiation of a composite function $h(x) = f(g(x))$ is consequently performed in three steps:

1. Identify the outer function $f$ and the inner function $g$ with $h(x) = f(g(x))$.

2. Differentiate the outer function $f$ at the point $g(x)$, i.e. compute $f'(y)$ and then substitute $y = g(x)$. The result is $f'(g(x))$.

3. Inner derivative: Differentiate the inner function $g$ and multiply it with the result of step 2. One obtains $h'(x) = f'(g(x)) \cdot g'(x)$.

In the case of three or more compositions, the above rules have to be applied recursively.


Example 7.23 (a) Let $h(x) = (\sin x)^3$. We identify the outer function $f(y) = y^3$ and the inner function $g(x) = \sin x$. Then

\[ h'(x) = 3 (\sin x)^2 \cdot \cos x. \]

(b) Let $h(x) = e^{-x^2}$. We identify $f(y) = e^y$ and $g(x) = -x^2$. Thus

\[ h'(x) = e^{-x^2} \cdot (-2x). \]
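Both results of this example can be cross-checked against a numerical derivative. The Python sketch below uses a symmetric (central) difference quotient, which is a choice made here for accuracy and not a method taken from this example; the step size and test points are likewise assumptions.

```python
import math

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# (a) h(x) = (sin x)^3 with h'(x) = 3 (sin x)^2 cos x
def h_a(x):
    return math.sin(x) ** 3

def dh_a(x):
    return 3 * math.sin(x) ** 2 * math.cos(x)

# (b) h(x) = exp(-x^2) with h'(x) = exp(-x^2) * (-2x)
def h_b(x):
    return math.exp(-x * x)

def dh_b(x):
    return math.exp(-x * x) * (-2 * x)

for x in [0.3, 1.0, 2.0]:
    print(f"x = {x}: (a) {central_difference(h_a, x):.8f} vs {dh_a(x):.8f}"
          f"   (b) {central_difference(h_b, x):.8f} vs {dh_b(x):.8f}")
```

Agreement to many digits at a handful of points does not prove the formulas, but it catches sign errors and forgotten inner derivatives quickly.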

The  last  rule  that  we  will  discuss  concerns  the  differentiation  of  the  inverse  of  a 
differentiable  function. 


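Proposition 7.24 (Inverse function rule) Let $f : I \to J$ be bijective, differentiable and $f'(y) \ne 0$ for all $y \in I$. Then $f^{-1} : J \to I$ is also differentiable and

\[ \frac{\mathrm{d}}{\mathrm{d}x} f^{-1}(x) = \frac{1}{f'(f^{-1}(x))}. \]

In shorthand notation this rule is

\[ \left( f^{-1} \right)' = \frac{1}{f' \circ f^{-1}}. \]

Proof We set $y = f^{-1}(x)$ and $\eta = f^{-1}(\xi)$. Due to the continuity of the inverse function (see Proposition C.3) we have that $\eta \to y$ as $\xi \to x$. It thus follows that

\[
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d}x} f^{-1}(x)
&= \lim_{\xi \to x} \frac{f^{-1}(\xi) - f^{-1}(x)}{\xi - x}
= \lim_{\eta \to y} \frac{\eta - y}{f(\eta) - f(y)} \\
&= \lim_{\eta \to y} \left( \frac{f(\eta) - f(y)}{\eta - y} \right)^{-1}
= \frac{1}{f'(y)} = \frac{1}{f'(f^{-1}(x))}
\end{aligned}
\]

and hence the statement of the proposition. □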


Figure 7.4 shows the geometric background of the inverse function rule: The slope of a straight line in $x$-direction is the reciprocal of the slope in $y$-direction.

If it is known beforehand that the inverse function is differentiable then its derivative can also be obtained in the following way. One differentiates the identity

\[ x = f(f^{-1}(x)) \]

with respect to $x$ using the chain rule. This yields

\[ 1 = f'(f^{-1}(x)) \cdot (f^{-1})'(x) \]

and one obtains the inverse rule by division by $f'(f^{-1}(x))$.

Fig. 7.4 Derivative of the inverse function with detailed view of the slopes


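Example 7.25 (Derivative of the logarithm) Since $y = \log x$ is the inverse function to $x = e^y$, it follows from the inverse function rule that

\[ (\log x)' = \frac{1}{e^{\log x}} = \frac{1}{x} \]

for $x > 0$. Furthermore

\[ \log |x| = \begin{cases} \log x, & x > 0, \\ \log(-x), & x < 0, \end{cases} \]

and thus

\[ (\log |x|)' = \begin{cases} (\log x)' = \dfrac{1}{x}, & x > 0, \\[2ex] (\log(-x))' = \dfrac{1}{(-x)} \cdot (-1) = \dfrac{1}{x}, & x < 0. \end{cases} \]

Altogether one obtains the formula

\[ (\log |x|)' = \frac{1}{x} \quad \text{for } x \ne 0. \]

For logarithms to the base $a$ one has

\[ \log_a x = \frac{\log x}{\log a}, \quad \text{thus} \quad (\log_a x)' = \frac{1}{x \log a}. \]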


Example 7.26 (Derivatives of general power functions) From $x^{\alpha} = e^{\alpha \log x}$ we infer by the chain rule that

\[ \left( x^{\alpha} \right)' = e^{\alpha \log x} \cdot \frac{\alpha}{x} = x^{\alpha} \cdot \frac{\alpha}{x} = \alpha x^{\alpha - 1}. \]

Example 7.27 (Derivative of the general exponential function) For $a > 0$ we have $a^x = e^{x \log a}$. An application of the chain rule shows that

\[ \left( a^x \right)' = \left( e^{x \log a} \right)' = e^{x \log a} \cdot \log a = a^x \log a. \]

Example 7.28 For $x > 0$ we have $x^x = e^{x \log x}$ and thus

\[ \left( x^x \right)' = e^{x \log x} \left( \log x + \frac{x}{x} \right) = x^x (\log x + 1). \]

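Example 7.29 (Derivatives of cyclometric functions) We recall the differentiation rules for the trigonometric functions on their principal branches:

\[
\begin{aligned}
(\sin x)' &= \cos x = \sqrt{1 - \sin^2 x}, & -\tfrac{\pi}{2} \le x \le \tfrac{\pi}{2}, \\
(\cos x)' &= -\sin x = -\sqrt{1 - \cos^2 x}, & 0 \le x \le \pi, \\
(\tan x)' &= 1 + \tan^2 x, & -\tfrac{\pi}{2} < x < \tfrac{\pi}{2}.
\end{aligned}
\]

The inverse function rule thus yields

\[
\begin{aligned}
(\arcsin x)' &= \frac{1}{\sqrt{1 - \sin^2(\arcsin x)}} = \frac{1}{\sqrt{1 - x^2}}, & -1 < x < 1, \\
(\arccos x)' &= \frac{-1}{\sqrt{1 - \cos^2(\arccos x)}} = -\frac{1}{\sqrt{1 - x^2}}, & -1 < x < 1, \\
(\arctan x)' &= \frac{1}{1 + \tan^2(\arctan x)} = \frac{1}{1 + x^2}, & -\infty < x < \infty.
\end{aligned}
\]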


Example 7.30 (Derivatives of hyperbolic and inverse hyperbolic functions) The
derivative of the hyperbolic sine is readily computed by invoking the defining formula:

$$(\sinh x)' = \frac{1}{2} \left( e^x - e^{-x} \right)' = \frac{1}{2} \left( e^x + e^{-x} \right) = \cosh x.$$

The derivative of the hyperbolic cosine is obtained in the same way; for differentiating
the hyperbolic tangent, the quotient rule is to be applied (see Exercise 3):

$$(\cosh x)' = \sinh x, \qquad (\tanh x)' = 1 - \tanh^2 x.$$

The derivative of the inverse hyperbolic sine can be computed by means of the inverse
function rule:

$$(\operatorname{arsinh} x)' = \frac{1}{\cosh(\operatorname{arsinh} x)} = \frac{1}{\sqrt{1 + \sinh^2(\operatorname{arsinh} x)}} = \frac{1}{\sqrt{1 + x^2}}$$

for $x \in \mathbb{R}$, where we have used the identity $\cosh^2 x - \sinh^2 x = 1$. In a similar way,
the derivatives of the other inverse hyperbolic functions can be computed on their
respective domains (Exercise 3):

$$(\operatorname{arcosh} x)' = \frac{1}{\sqrt{x^2 - 1}}, \qquad x > 1,$$
$$(\operatorname{artanh} x)' = \frac{1}{1 - x^2}, \qquad -1 < x < 1.$$
Table 7.1 Derivatives of the elementary functions ($\alpha \in \mathbb{R}$, $a > 0$)

  $f(x)$                      $f'(x)$
  $1$                         $0$
  $x^\alpha$                  $\alpha x^{\alpha - 1}$
  $e^x$                       $e^x$
  $a^x$                       $a^x \log a$
  $\log |x|$                  $1/x$
  $\log_a x$                  $1/(x \log a)$

  $\sin x$                    $\cos x$
  $\cos x$                    $-\sin x$
  $\tan x$                    $1 + \tan^2 x$
  $\arcsin x$                 $1/\sqrt{1 - x^2}$
  $\arccos x$                 $-1/\sqrt{1 - x^2}$
  $\arctan x$                 $1/(1 + x^2)$

  $\sinh x$                   $\cosh x$
  $\cosh x$                   $\sinh x$
  $\tanh x$                   $1 - \tanh^2 x$
  $\operatorname{arsinh} x$   $1/\sqrt{1 + x^2}$
  $\operatorname{arcosh} x$   $1/\sqrt{x^2 - 1}$
  $\operatorname{artanh} x$   $1/(1 - x^2)$

The derivatives of the most important elementary functions are collected in Table 7.1.
The formulas are valid on the respective domains.


7.5  Numerical  Differentiation 


In applications it often happens that a function can be evaluated for arbitrary arguments,
but no analytic formula is known which represents the function. This situation,
for example, arises if the dependent variable is determined using a measuring instrument,
e.g. the temperature at a given point as a function of time.

The definition of the derivative as a limit of difference quotients suggests that
the derivative of such functions can be approximated by an appropriate difference
quotient

$$f'(a) \approx \frac{f(a + h) - f(a)}{h}.$$


The  question  is  how  small  h  should  be  chosen.  In  order  to  decide  this  we  will  first 
carry  out  a  numerical  experiment. 


Experiment 7.31 Use the above formula to approximate the derivative $f'(a)$ of
$f(x) = e^x$ at $a = 1$. Consider different values of $h$, for example $h = 10^{-j}$ with
$j = 0, 1, \ldots, 16$. One expects a value close to $e = 2.71828\ldots$ as result. Typical
outcomes of such an experiment are listed in Table 7.2.


One sees that the error initially decreases with $h$, but increases again for smaller
$h$. The reason lies in the representation of numbers on a computer. The experiment
was carried out in IEEE double precision which corresponds to a relative
machine accuracy of $\text{eps} \approx 10^{-16}$. The experiment shows that the best result is
obtained for

$$h \approx \sqrt{\text{eps}} \approx 10^{-8}.$$
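The experiment can be repeated with a few lines of code. The following is a minimal sketch in Python, not the book's own program; the names `forward_difference` and `errors` are chosen here for illustration only.

```python
import math

def forward_difference(f, a, h):
    """One-sided difference quotient (f(a + h) - f(a)) / h."""
    return (f(a + h) - f(a)) / h

exact = math.e            # f'(1) for f(x) = e^x
errors = {}               # maps j to the error obtained for h = 10^(-j)
for j in range(17):
    h = 10.0 ** (-j)
    value = forward_difference(math.exp, 1.0, h)
    errors[j] = value - exact
    print(f"{h:.3E}  {value:.14f}  {errors[j]: .14E}")
```

The rows in which the discretisation error dominates should agree with the corresponding rows of Table 7.2. The rows dominated by rounding depend on the arithmetic used and need not match digit for digit; in plain double precision, for instance, $1 + 10^{-16}$ rounds to $1$, so the last quotient evaluates to $0$.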
Table 7.2 Numerical differentiation of the exponential function at $a = 1$ using a one-sided
difference quotient. The numerical results and errors are given as functions of $h$

  h            Value               Error
  1.000E-000   4.67077427047160    1.95249244201256E-000
  1.000E-001   2.85884195487388    1.40560126414838E-001
  1.000E-002   2.73191865578714    1.36368273280976E-002
  1.000E-003   2.71964142253338    1.35959407433051E-003
  1.000E-004   2.71841774708220    1.35918623152431E-004
  1.000E-005   2.71829541994577    1.35914867218645E-005
  1.000E-006   2.71828318752147    1.35906242526573E-006
  1.000E-007   2.71828196740610    1.38947053418548E-007
  1.000E-008   2.71828183998415    1.15251088672608E-008
  1.000E-009   2.71828219937549    3.70916445113778E-007
  1.000E-010   2.71828349976758    1.67130853068187E-006
  1.000E-011   2.71829650802524    1.46795661959409E-005
  1.000E-012   2.71866817252997    3.86344070924416E-004
  1.000E-013   2.71755491373926   -7.26914719783700E-004
  1.000E-014   2.73058485544819    1.23030269891471E-002
  1.000E-015   3.16240089670572    4.44119068246674E-001
  1.000E-016   1.44632569809566   -1.27195613036338E-000

This behaviour can be explained by using Taylor expansion. In Chap. 12 we will
derive the formula

$$f(a + h) = f(a) + h f'(a) + \frac{h^2}{2} f''(\xi),$$

where $\xi$ denotes an appropriate point between $a$ and $a + h$. (The value of $\xi$ is usually
not known.) Thus, after rearranging, we get

$$f'(a) = \frac{f(a + h) - f(a)}{h} - \frac{h}{2} f''(\xi).$$

The discretisation error, i.e. the error which arises from replacing the derivative
by the difference quotient, is proportional to $h$ and decreases linearly with $h$. This
behaviour can also be seen in the numerical experiment for $h$ between $10^{-2}$ and
$10^{-8}$.

For very small $h$ rounding errors additionally come into play. As we have seen in
Sect. 1.4 the calculation of $f(a)$ on a computer yields

$$\operatorname{rd}(f(a)) = f(a) \cdot (1 + \varepsilon) = f(a) + \varepsilon f(a)$$

with $|\varepsilon| \le \text{eps}$. The rounding error turns out to be proportional to $\text{eps}/h$ and
increases dramatically for small $h$. This behaviour can be seen in the numerical
experiment for $h$ between $10^{-8}$ and $10^{-16}$.

Fig. 7.5 Approximation of the tangent by a symmetric secant

The result of the numerical derivative using the one-sided difference quotient

$$f'(a) \approx \frac{f(a + h) - f(a)}{h}$$

is then most precise if discretisation and rounding error have approximately the same
magnitude, so if

$$h \approx \frac{\text{eps}}{h} \quad \text{or} \quad h \approx \sqrt{\text{eps}} \approx 10^{-8}.$$

In order to calculate the derivative $f'(a)$ one can also use a secant placed
symmetrically around $(a, f(a))$, i.e.

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a - h)}{2h}.$$

This suggests the symmetric formula

$$f'(a) \approx \frac{f(a + h) - f(a - h)}{2h}.$$

This approximation is called symmetric difference quotient (Fig. 7.5).

To analyse the accuracy of the approximation, we need the Taylor series from
Chap. 12:

$$f(a + h) = f(a) + h f'(a) + \frac{h^2}{2} f''(a) + \frac{h^3}{6} f'''(a) + \cdots$$

If one replaces $h$ by $-h$ in this formula

$$f(a - h) = f(a) - h f'(a) + \frac{h^2}{2} f''(a) - \frac{h^3}{6} f'''(a) + \cdots$$
Table 7.3 Numerical differentiation of the exponential function at $a = 1$ using a symmetric
difference quotient. The numerical results and errors are given as functions of $h$

  h            Value               Error
  1.000E-000   3.19452804946533    4.76246221006280E-001
  1.000E-001   2.72281456394742    4.53273548837307E-003
  1.000E-002   2.71832713338270    4.53049236583958E-005
  1.000E-003   2.71828228150582    4.53046770765297E-007
  1.000E-004   2.71828183298958    4.53053283777649E-009
  1.000E-005   2.71828182851255    5.35020916458961E-011
  1.000E-006   2.71828182834134   -1.17704512803130E-010
  1.000E-007   2.71828182903696    5.77919490041268E-010
  1.000E-008   2.71828181795317   -1.05058792776447E-008
  1.000E-009   2.71828182478364   -3.67540575751946E-009
  1.000E-010   2.71828199164235    1.63183308643511E-007
  1.000E-011   2.71829103280427    9.20434522511116E-006
  1.000E-012   2.71839560410381    1.13775644761560E-004

and takes the difference, one obtains

$$f(a + h) - f(a - h) = 2h f'(a) + 2\, \frac{h^3}{6} f'''(a) + \cdots$$

and furthermore

$$f'(a) = \frac{f(a + h) - f(a - h)}{2h} - \frac{h^2}{6} f'''(a) + \cdots$$

In this case the discretisation error is hence proportional to $h^2$, while the rounding
error is still proportional to $\text{eps}/h$.

The symmetric procedure thus delivers the best results for

$$h^2 \approx \frac{\text{eps}}{h} \quad \text{or} \quad h \approx \sqrt[3]{\text{eps}},$$

respectively. We repeat Experiment 7.31 with $f(x) = e^x$, $a = 1$ and $h = 10^{-j}$ for
$j = 0, \ldots, 12$. The results are listed in Table 7.3.

As expected one obtains the best result for $h \approx 10^{-5}$. The obtained approximation
is more precise than that of Table 7.2. Since symmetric procedures generally give
better results, symmetry is an important concept in numerical mathematics.
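The symmetric variant of the experiment can be sketched in the same way. This is again only an illustrative Python sketch; the names `central_difference` and `sym_errors` are assumptions made here, not identifiers from the book's programs.

```python
import math

def central_difference(f, a, h):
    """Symmetric difference quotient (f(a + h) - f(a - h)) / (2h)."""
    return (f(a + h) - f(a - h)) / (2.0 * h)

exact = math.e            # f'(1) for f(x) = e^x
sym_errors = {}           # maps j to the error obtained for h = 10^(-j)
for j in range(13):
    h = 10.0 ** (-j)
    value = central_difference(math.exp, 1.0, h)
    sym_errors[j] = value - exact
    print(f"{h:.3E}  {value:.14f}  {sym_errors[j]: .14E}")
```

For the larger step sizes the error shrinks by a factor of about 100 whenever $h$ is divided by 10, which reflects the $h^2$ behaviour of the discretisation error. As before, the rows dominated by rounding may differ from the printed table in their last digits.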

Numerical differentiation of noisy functions. In practice it often occurs that a
function which has to be differentiated consists of discrete data that are additionally
perturbed by noise. The noise represents small measuring errors and behaves
statistically like random numbers.

Fig. 7.6 The left picture shows random noise which masks the data. The noise is modelled by 801
normally distributed random numbers. The frequencies of the chosen random numbers can be seen
in the histogram in the right picture. For comparison, the (scaled) density of the corresponding
normal distribution is given there as well


Example 7.32 Digitising a line of a picture by $J + 1$ pixels produces a function

$$f : \{0, 1, \ldots, J\} \to \mathbb{R} : j \mapsto f(j) = f_j = \text{brightness of the } j\text{th pixel}.$$

In order to find an edge in the picture, where the brightness locally changes very
rapidly, this function has to be differentiated.

We consider a concrete example. Suppose that the picture information consists of
the function

$$g : [a, b] \to \mathbb{R} : x \mapsto g(x) = -2x^3 + 4x$$

with $a = -2$ and $b = 2$. Let $\Delta x$ be the distance between two pixels and

$$J = \frac{b - a}{\Delta x}$$

denote the total number of pixels minus 1. We choose $\Delta x = 1/200$ and thus obtain
$J = 800$. The actual brightness of the $j$th pixel would then be

$$g_j = g(a + j \Delta x), \qquad 0 \le j \le J.$$

However, due to measuring errors the measuring instrument supplies

$$f_j = g_j + \varepsilon_j,$$

where $\varepsilon_j$ are random numbers. We choose normally distributed random numbers
with expected value 0 and variance $2.5 \cdot 10^{-5}$ for $\varepsilon_j$, see Fig. 7.6. For an exact
definition of the notions of expected value and variance we refer to the literature, for
instance [18].

These random numbers can be generated in MATLAB using the command

    randn(1,801)*sqrt(2.5e-5)
Fig. 7.7 Numerically obtained derivative of a noisy function $f$, consisting of 801 data values (left);
derivative of the same function after filtering using a Gaussian filter (middle) and after smoothing
using splines (right)


Differentiating $f$ using the previous rules generates

$$f_j' \approx \frac{f_j - f_{j-1}}{\Delta x} = \frac{g_j - g_{j-1}}{\Delta x} + \frac{\varepsilon_j - \varepsilon_{j-1}}{\Delta x}$$

and the part with $g$ gives the desired value of the derivative, namely

$$\frac{g_j - g_{j-1}}{\Delta x} = \frac{g(a + j \Delta x) - g(a + j \Delta x - \Delta x)}{\Delta x} \approx g'(a + j \Delta x).$$

The sequence of random numbers results in a non-differentiable graph. The expression

$$\frac{\varepsilon_j - \varepsilon_{j-1}}{\Delta x}$$

is proportional to $J \cdot \max_{0 \le j \le J} |\varepsilon_j|$. The errors become dominant for large $J$, see
Fig. 7.7, left picture.

To still obtain reliable results, the data have to be smoothed before differentiating.
The simplest method is a so-called convolution with a Gaussian filter which amounts
to a weighted averaging of the data (Fig. 7.7, middle). Alternatively one can also use
splines for smoothing, for example the routine csaps in MATLAB. For the right
picture in Fig. 7.7 this method has been used.
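The effect of smoothing can be tried out with a small Python sketch. It is not the program behind Fig. 7.7: the fixed seed, the filter width of 10 pixels, the truncation of the Gaussian weights at three widths and all function names are assumptions made here for illustration, and the pixels near both ends are simply dropped after averaging.

```python
import math
import random

random.seed(0)  # arbitrary fixed seed, only to make the run reproducible

a, b = -2.0, 2.0
dx = 1.0 / 200.0
J = round((b - a) / dx)                    # J = 800, hence 801 data values

def g(x):
    return -2.0 * x**3 + 4.0 * x           # exact brightness

def dg(x):
    return -6.0 * x**2 + 4.0               # exact derivative, for comparison only

xs = [a + j * dx for j in range(J + 1)]
noise_std = math.sqrt(2.5e-5)              # standard deviation for variance 2.5e-5
f = [g(x) + random.gauss(0.0, noise_std) for x in xs]

def difference_quotients(values, step):
    """Return (v_j - v_{j-1}) / step for j = 1, ..., len(values) - 1."""
    return [(values[j] - values[j - 1]) / step for j in range(1, len(values))]

def gaussian_average(values, width):
    """Weighted average with Gaussian weights of standard deviation `width`
    (in pixels), truncated at 3 * width. Returns the averaged interior values
    and the number of pixels dropped at each end."""
    radius = 3 * width
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / width) ** 2) for k in offsets]
    total = sum(weights)
    averaged = [
        sum(w * values[j + k] for w, k in zip(weights, offsets)) / total
        for j in range(radius, len(values) - radius)
    ]
    return averaged, radius

raw = difference_quotients(f, dx)          # raw[j - 1] approximates g'(xs[j])
raw_error = max(abs(raw[j - 1] - dg(xs[j])) for j in range(1, J + 1))

smooth, radius = gaussian_average(f, width=10)
smooth_deriv = difference_quotients(smooth, dx)
smooth_error = max(
    abs(d - dg(xs[radius + i + 1])) for i, d in enumerate(smooth_deriv)
)

print(f"max error without smoothing: {raw_error:.3f}")
print(f"max error with smoothing:    {smooth_error:.3f}")
```

A wider filter suppresses more noise but also blurs the data, so the width is a compromise that has to be chosen for the problem at hand.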


Experiment 7.33 Generate Fig. 7.7 using the MATLAB program mat07_1.m and
investigate the influence of the choice of random numbers and the smoothing parameter
in csaps on the result.


7.6  Exercises 

1. Compute the first derivative of the functions

$$f(x) = x^3, \quad g(t) = \frac{1}{t^2}, \quad h(x) = \cos x, \quad k(x) = \frac{1}{\sqrt{x}}, \quad \ell(t) = \tan t$$

using the definition of the derivative as a limit.
2. Compute the first derivative of the functions

$$a(x) = \frac{x^2 - 1}{x^2 + 2x + 1}, \quad b(x) = (x^3 - 1) \sin^2 x, \quad c(t) = \sqrt{1 + t^2}\, \arctan t,$$
$$d(t) = t^2 e^{\cos(t^2 + 1)}, \quad e(x) = x^{2 \sin x}, \quad f(s) = \log\left(s + \sqrt{1 + s^2}\right).$$

Check your results with maple.

3.  Derive  the  remaining  formulas  in  Example  7.30.  Start  by  computing  the  deriva¬ 
tives  of  the  hyperbolic  cosine  and  hyperbolic  tangent.  Use  the  inverse  function 
rule  to  differentiate  the  inverse  hyperbolic  cosine  and  inverse  hyperbolic  tangent. 

4. Compute an approximation of $\sqrt{34}$ by replacing the function $f(x) = \sqrt{x}$ at
$x = 36$ by its linear approximation. How accurate is your result?

5. Find the equation of the tangent line to the graph of the function $y = f(x)$
through the point $(x_0, f(x_0))$, where


fix)  =  -  +  - - 

2  log  v 


and  (a)  vq  =  e;  (b)  vq  =  e  . 


6. Sand runs from a conveyor belt onto a heap with a velocity of 2 m³/min. The
sand  forms  a  cone-shaped  pile  whose  height  equals  ^  of  the  radius.  With  which 
velocity  does  the  radius  grow  if  the  sand  cone  has  a  diameter  of  6  m? 

Hint.  Determine  the  volume  V  as  a  function  of  the  radius  r,  consider  V  and  r  as 
functions  of  time  t  and  differentiate  the  equation  with  respect  to  t .  Compute  f . 

7. Use the Taylor series

$$y(x + h) = y(x) + h y'(x) + \frac{h^2}{2} y''(x) + \frac{h^3}{6} y'''(x) + \frac{h^4}{24} y^{(4)}(x) + \cdots$$

to derive the formula

$$y''(x) = \frac{y(x + h) - 2y(x) + y(x - h)}{h^2} - \frac{h^2}{12} y^{(4)}(x) + \cdots$$

and read off from that a numerical method for calculating the second derivative.
The discretisation error is proportional to $h^2$, and the rounding error is proportional
to $\text{eps}/h^2$. By equating the discretisation and the rounding error deduce
the optimal step size $h$. Check your considerations by performing a numerical
experiment in MATLAB, computing the second derivative of $y(x) = e^{2x}$ at the
point $x = 1$.

8. Write a MATLAB program which numerically differentiates a given function on a
given interval and plots the function and its first derivative. Test your program
on the functions

$$f(x) = \cos x, \qquad 0 \le x \le 6\pi,$$

and

$$g(x) = e^{-\cos(3x)}, \qquad 0 \le x \le 2.$$
9. Show that the $n$th derivative of the power function $y = x^n$ equals $n!$ for $n \ge 1$.
Verify that the derivative of order $n + 1$ of a polynomial
$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ of degree $n$ equals zero.

10. Compute the second derivative of the functions

$$f(x) = e^{-x^2}, \quad g(x) = \log\left(x + \sqrt{1 + x^2}\right), \quad h(x) = \log \frac{x + 1}{x - 1}.$$

8  Applications of the Derivative


This  chapter  is  devoted  to  some  applications  of  the  derivative  which  form  part  of 
the  basic  skills  in  modelling.  We  start  with  a  discussion  of  features  of  graphs.  More 
precisely,  we  use  the  derivative  to  describe  geometric  properties  like  maxima,  minima 
and  monotonicity.  Even  though  plotting  functions  with  MATLAB  or  maple  is  simple, 
understanding  the  connection  with  the  derivative  is  important,  for  example,  when  a 
function  with  given  properties  is  to  be  chosen  from  a  particular  class  of  functions. 

In  the  following  section  we  discuss  Newton’s  method  and  the  concept  of  order 
of  convergence.  Newton’s  method  is  one  of  the  most  important  tools  for  computing 
zeros  of  functions.  It  is  nearly  universally  in  use. 

The  final  section  of  this  chapter  is  devoted  to  an  elementary  method  from  data 
analysis.  We  show  how  to  compute  a  regression  line  through  the  origin.  There  are 
many  areas  of  application  that  involve  linear  regression.  This  topic  will  be  developed 
in  more  detail  in  Chap.  18. 


8.1  Curve  Sketching 

In  the  following  we  investigate  some  geometric  properties  of  graphs  of  functions 
using  the  derivative:  maxima  and  minima,  intervals  of  monotonicity  and  convexity. 
We  further  discuss  the  mean  value  theorem  which  is  an  important  technical  tool  for 
proofs. 

Definition 8.1 A function $f : [a, b] \to \mathbb{R}$ has
(a) a global maximum at $x_0 \in [a, b]$ if

$$f(x) \le f(x_0) \quad \text{for all } x \in [a, b];$$

(b) a local maximum at $x_0 \in [a, b]$, if there exists a neighbourhood $U_\varepsilon(x_0)$ so that

$$f(x) \le f(x_0) \quad \text{for all } x \in U_\varepsilon(x_0) \cap [a, b].$$

The maximum is called strict if the strict inequality $f(x) < f(x_0)$ holds in (a) or
(b) for $x \ne x_0$.

Fig. 8.1 Minima and maxima of a function

The  definition  for  minimum  is  analogous  by  inverting  the  inequalities.  Maxima 
and  minima  are  subsumed  under  the  term  extrema.  Figure  8.1  shows  some  possible 
situations.  Note  that  the  function  there  does  not  have  a  global  minimum  on  the  chosen 
interval. 

For points $x_0$ in the open interval $(a, b)$ one has a simple necessary condition for
extrema of differentiable functions:


Proposition 8.2 Let $x_0 \in (a, b)$ and $f$ be differentiable at $x_0$. If $f$ has a local maximum
or minimum at $x_0$ then $f'(x_0) = 0$.


Proof Due to the differentiability of $f$ we have

$$f'(x_0) = \lim_{h \to 0+} \frac{f(x_0 + h) - f(x_0)}{h} = \lim_{h \to 0-} \frac{f(x_0 + h) - f(x_0)}{h}.$$

In the case of a maximum the slope of the secant satisfies the inequalities

$$\frac{f(x_0 + h) - f(x_0)}{h} \le 0 \quad \text{if } h > 0,$$
$$\frac{f(x_0 + h) - f(x_0)}{h} \ge 0 \quad \text{if } h < 0.$$

Consequently the limit $f'(x_0)$ has to be greater than or equal to zero as well as
smaller than or equal to zero, thus necessarily $f'(x_0) = 0$. □


The function $f(x) = x^3$, whose derivative vanishes at $x = 0$, shows that the condition
of the proposition is not sufficient for the existence of a maximum or minimum.

The geometric content of the proposition is that in the case of differentiability the
graph of the function has a horizontal tangent at a maximum or minimum. A point
$x_0 \in (a, b)$ where $f'(x_0) = 0$ is called a stationary point.


Fig. 8.2 The mean value theorem


Remark 8.3 The proposition shows that the following point sets have to be checked
in order to determine the maxima and minima of a function $f : [a, b] \to \mathbb{R}$:

(a) the boundary points $x_0 = a$, $x_0 = b$;

(b) points $x_0 \in (a, b)$ at which $f$ is not differentiable;

(c) points $x_0 \in (a, b)$ at which $f$ is differentiable and $f'(x_0) = 0$.

The following proposition is a useful technical tool for proofs. One of its applications
lies in estimating the error of numerical methods. Similarly to the intermediate
value theorem, the proof is based on the completeness of the real numbers. We are not
going to present it here but instead refer to the literature, for instance [3, Chap. 3.2].

Proposition 8.4 (Mean value theorem) Let $f$ be continuous on $[a, b]$ and differentiable
on $(a, b)$. Then there exists a point $\xi \in (a, b)$ such that

$$\frac{f(b) - f(a)}{b - a} = f'(\xi).$$

Geometrically this means that the tangent at $\xi$ has the same slope as the secant
through $(a, f(a))$, $(b, f(b))$. Figure 8.2 illustrates this fact.
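For a simple function the point $\xi$ can be computed explicitly. As a small worked illustration (the choice $f(x) = x^2$ is made here for convenience and is not taken from the text):

```latex
f(x) = x^2: \qquad
\frac{f(b) - f(a)}{b - a} = \frac{b^2 - a^2}{b - a} = a + b,
\qquad f'(\xi) = 2\xi
\quad\Longrightarrow\quad
\xi = \frac{a + b}{2} \in (a, b).
```

For a parabola the tangent is thus parallel to the secant exactly at the midpoint of the interval; for general functions $\xi$ is usually not known explicitly.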

We  now  turn  to  the  description  of  the  behaviour  of  the  slope  of  differentiable 
functions. 

Definition 8.5 A function $f : I \to \mathbb{R}$ is called monotonically increasing, if

$$x_1 < x_2 \quad \Rightarrow \quad f(x_1) \le f(x_2)$$

for all $x_1, x_2 \in I$. It is called strictly monotonically increasing, if

$$x_1 < x_2 \quad \Rightarrow \quad f(x_1) < f(x_2).$$

A function $f$ is said to be (strictly) monotonically decreasing, if $-f$ is (strictly)
monotonically increasing.

Examples of strictly monotonically increasing functions are the power functions
$x \mapsto x^n$ with odd powers $n$; a monotonically, but not strictly monotonically increasing
function is the sign function $x \mapsto \operatorname{sign} x$, for instance. The behaviour of the slope
of a differentiable function can be described by the sign of the first derivative.
Fig. 8.3 Local maximum


Proposition 8.6 For differentiable functions $f : (a, b) \to \mathbb{R}$ the following implications
hold:

(a) $f' \ge 0$ on $(a, b)$ $\Leftrightarrow$ $f$ is monotonically increasing;
    $f' > 0$ on $(a, b)$ $\Rightarrow$ $f$ is strictly monotonically increasing.

(b) $f' \le 0$ on $(a, b)$ $\Leftrightarrow$ $f$ is monotonically decreasing;
    $f' < 0$ on $(a, b)$ $\Rightarrow$ $f$ is strictly monotonically decreasing.


Proof (a) According to the mean value theorem we have
$f(x_2) - f(x_1) = f'(\xi) \cdot (x_2 - x_1)$ for a certain $\xi \in (a, b)$. If $x_1 < x_2$ and
$f'(\xi) \ge 0$ then $f(x_2) - f(x_1) \ge 0$. If $f'(\xi) > 0$ then $f(x_2) - f(x_1) > 0$. Conversely

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \ge 0,$$

if $f$ is increasing. The proof for (b) is similar. □


Remark 8.7 The example $f(x) = x^3$ shows that $f$ can be strictly monotonically
increasing even if $f' = 0$ at isolated points.


Proposition 8.8 (Criterion for local extrema) Let $f$ be differentiable on $(a, b)$,
$x_0 \in (a, b)$ and $f'(x_0) = 0$. Then

(a) $f'(x) > 0$ for $x < x_0$ and $f'(x) < 0$ for $x > x_0$ $\Rightarrow$ $f$ has a local maximum at $x_0$,

(b) $f'(x) < 0$ for $x < x_0$ and $f'(x) > 0$ for $x > x_0$ $\Rightarrow$ $f$ has a local minimum at $x_0$.


Proof The proof follows from the previous proposition which characterises the
monotonic behaviour as shown in Fig. 8.3. □

Remark 8.9 (Convexity and concavity of a function graph) If $f'' > 0$ holds in an
interval then $f'$ is monotonically increasing there. Thus the graph of $f$ is curved to
the left or convex. On the other hand, if $f'' < 0$, then $f'$ is monotonically decreasing
and the graph of $f$ is curved to the right or concave (see Fig. 8.4). A quantitative
description of the curvature of the graph of a function will be given in Sect. 14.2.

Fig. 8.4 Convexity/concavity and second derivative

Let $x_0$ be a point where $f'(x_0) = 0$. If $f'$ does not change its sign at $x_0$, then $x_0$ is
an inflection point. Here $f$ changes from positive to negative curvature or vice versa.

Proposition 8.10 (Second derivative criterion for local extrema) Let $f$ be twice
continuously differentiable on $(a, b)$, $x_0 \in (a, b)$ and $f'(x_0) = 0$.

(a) If $f''(x_0) > 0$ then $f$ has a local minimum at $x_0$.

(b) If $f''(x_0) < 0$ then $f$ has a local maximum at $x_0$.

Proof (a) Since $f''$ is continuous, $f''(x) > 0$ for all $x$ in a neighbourhood of
$x_0$. According to Proposition 8.6, $f'$ is strictly monotonically increasing in this
neighbourhood. Because of $f'(x_0) = 0$ this means that $f'(x) < 0$ for $x < x_0$ and
$f'(x) > 0$ for $x > x_0$; according to the criterion for local extrema, $x_0$ is a minimum.
The assertion (b) can be shown similarly. □

Remark 8.11 If $f''(x_0) = 0$ there can either be an inflection point or a minimum or
maximum. The functions $f(x) = x^n$, $n = 3, 4, 5, \ldots$ supply a typical example. In
fact, they have for $n$ even a global minimum at $x = 0$, and an inflection point for $n$ odd.
More general functions can easily be assessed using Taylor expansion. An extreme
value criterion based on this expansion will be discussed in Application 12.14.

One  of  the  applications  of  the  previous  propositions  is  curve  sketching ,  which 
is  the  detailed  investigation  of  the  properties  of  the  graph  of  a  function  using 
differential  calculus.  Even  though  graphs  can  easily  be  plotted  in  MATLAB  or  maple  it 
is  still  often  necessary  to  check  the  graphical  output  at  certain  points  using  analytic 
methods. 

Experiment 8.12 Plot the function

$$y = x (\operatorname{sign} x - 1)(x + 1)^3 + (\operatorname{sign}(x - 1) + 1)\left((x - 2)^4 - 1/2\right)$$

on the interval $-2 \le x \le 3$ and try to read off the local and global extrema, the
inflection points and the monotonic behaviour. Check your observations using the
criteria discussed above.

A  further  application  of  the  previous  propositions  consists  in  finding  extrema , 
i.e.  solving  one-dimensional  optimisation  problems.  We  illustrate  this  topic  using  a 
standard  example. 
Example 8.13 Which rectangle with a given perimeter has the largest area? To
answer this question we denote the lengths of the sides of the rectangle by $x$ and $y$.
Then the perimeter and the area are given by

$$U = 2x + 2y, \qquad F = xy.$$

Since $U$ is fixed, we obtain $y = U/2 - x$, and from that

$$F = x(U/2 - x),$$

where $x$ can vary in the domain $0 \le x \le U/2$. We want to find the maximum of
the function $F$ on the interval $[0, U/2]$. Since $F$ is differentiable, we only have to
investigate the boundary points and the stationary points. At the boundary points
$x = 0$ and $x = U/2$ we have $F(0) = 0$ and $F(U/2) = 0$. The stationary points are
obtained by setting the derivative to zero

$$F'(x) = U/2 - 2x = 0,$$

which brings us to $x = U/4$ with the function value $F(U/4) = U^2/16$.

As result we get that the maximum area is obtained at $x = U/4$, thus in the case
of a square.


8.2  Newton's  Method 

With the help of differential calculus efficient numerical methods for computing
zeros of differentiable functions can be constructed. One of the basic procedures is
Newton's method¹ which will be discussed in this section for the case of real-valued
functions $f : D \subset \mathbb{R} \to \mathbb{R}$.

First we recall the bisection method discussed in Sect. 6.3. Consider a continuous,
real-valued function $f$ on an interval $[a, b]$ with

$$f(a) < 0, \ f(b) > 0 \quad \text{or} \quad f(a) > 0, \ f(b) < 0.$$

With the help of continued bisection of the interval, one obtains a zero $\xi$ of $f$
satisfying

$$a = a_1 \le a_2 \le a_3 \le \cdots \le \xi \le \cdots \le b_3 \le b_2 \le b_1 = b,$$

where

$$|b_{n+1} - a_{n+1}| = \frac{1}{2} |b_n - a_n| = \frac{1}{4} |b_{n-1} - a_{n-1}| = \ldots = \frac{1}{2^n} |b_1 - a_1|.$$

¹ I. Newton, 1642-1727.
If one stops after $n$ iterations and chooses $a_n$ or $b_n$ as approximation for $\xi$ then one
gets a guaranteed error bound

$$|\text{error}| \le \varphi(n) = |b_n - a_n|.$$

Note that we have

$$\varphi(n + 1) = \frac{1}{2} \varphi(n).$$

The error thus decays with each iteration by (at least) a constant factor $\frac{1}{2}$, and one
calls the method linearly convergent. More generally, an iteration scheme is called
convergent of order $\alpha$ if there exist error bounds $(\varphi(n))_{n \ge 1}$ and a constant $C > 0$
such that

$$\lim_{n \to \infty} \frac{\varphi(n + 1)}{(\varphi(n))^\alpha} = C.$$

For sufficiently large $n$, one thus has approximately

$$\varphi(n + 1) \approx C (\varphi(n))^\alpha.$$

Linear convergence ($\alpha = 1$) therefore implies

$$\varphi(n + 1) \approx C \varphi(n) \approx C^2 \varphi(n - 1) \approx \ldots \approx C^n \varphi(1).$$

Plotting the logarithm of $\varphi(n)$ against $n$ (semi-logarithmic representation, as shown
for example in Fig. 8.6) results in a straight line:

$$\log \varphi(n + 1) \approx n \log C + \log \varphi(1).$$

If $C < 1$ then the error bound $\varphi(n + 1)$ tends to 0 and the number of correct decimal
places increases with each iteration by a constant. Quadratic convergence would
mean that the number of correct decimal places approximately doubles with each
iteration.
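The halving of the error bound can be observed directly. The following Python sketch of the bisection method is an illustration only; the function name `bisection`, its signature and the test equation $x^3 - 2 = 0$ on $[-2, 2]$ are choices made here.

```python
def bisection(f, a, b, n):
    """Return the intervals [a_1, b_1], ..., [a_n, b_n] of the bisection method."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    intervals = [(a, b)]
    for _ in range(n - 1):
        mid = 0.5 * (a + b)
        fmid = f(mid)
        if fa * fmid <= 0:      # sign change in [a, mid]
            b = mid
        else:                   # sign change in [mid, b]
            a, fa = mid, fmid
        intervals.append((a, b))
    return intervals

intervals = bisection(lambda x: x**3 - 2, -2.0, 2.0, 17)
bounds = [b_n - a_n for a_n, b_n in intervals]      # error bounds phi(n)
for n, ((a_n, b_n), phi) in enumerate(zip(intervals, bounds), start=1):
    print(f"{n:2d}  {a_n:.14f}  {b_n:.14f}  {phi:.14f}")
```

Each printed bound is exactly half of the previous one, which is the linear convergence with $C = 1/2$ described above.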

Derivation of Newton's method. The aim of the construction is to obtain a procedure
that provides quadratic convergence ($\alpha = 2$), at least if one starts sufficiently close
to a simple zero $\xi$ of a differentiable function. The geometric idea behind Newton's
method is simple: Once an approximation $x_n$ is chosen, one calculates $x_{n+1}$ as the
intersection of the tangent to the graph of $f$ through $(x_n, f(x_n))$ with the $x$-axis, see
Fig. 8.5. The equation of the tangent is given by

$$y = f(x_n) + f'(x_n)(x - x_n).$$

The point of intersection with the $x$-axis is obtained from

$$0 = f(x_n) + f'(x_n)(x_{n+1} - x_n),$$

thus

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \qquad n \ge 1.$$

Fig. 8.5 Two steps of Newton's method

Obviously it has to be assumed that $f'(x_n) \ne 0$. This condition is fulfilled, if $f'$ is
continuous, $f'(\xi) \ne 0$ and $x_n$ is sufficiently close to the zero $\xi$.


Proposition 8.14 (Convergence of Newton's method) Let $f$ be a real-valued function,
twice differentiable with a continuous second derivative. Further, let $f(\xi) = 0$
and $f'(\xi) \ne 0$. Then there exists a neighbourhood $U_\varepsilon(\xi)$ such that Newton's method
converges quadratically to $\xi$ for every starting value $x_1 \in U_\varepsilon(\xi)$.


Proof Since $f'(\xi) \ne 0$ and $f'$ is continuous, there exist a neighbourhood $U_\delta(\xi)$ and
a bound $m > 0$ so that $|f'(x)| \ge m$ for all $x \in U_\delta(\xi)$. Applying the mean value
theorem twice gives

$$|x_{n+1} - \xi| = \left| x_n - \xi - \frac{f(x_n) - f(\xi)}{f'(x_n)} \right|
= |x_n - \xi| \left| 1 - \frac{f'(\eta)}{f'(x_n)} \right|
= |x_n - \xi| \, \frac{|f'(x_n) - f'(\eta)|}{|f'(x_n)|}
\le |x_n - \xi|^2 \, \frac{|f''(\zeta)|}{|f'(x_n)|}$$

with $\eta$ between $x_n$ and $\xi$ and $\zeta$ between $x_n$ and $\eta$. Let $M$ denote the maximum of
$|f''|$ on $U_\delta(\xi)$. Under the assumption that all iterates $x_n$ lie in the neighbourhood
$U_\delta(\xi)$, we obtain the quadratic error bound

$$\varphi(n + 1) = |x_{n+1} - \xi| \le |x_n - \xi|^2 \, \frac{M}{m} = (\varphi(n))^2 \, \frac{M}{m}$$

for the error $\varphi(n) = |x_n - \xi|$. Thus, the assertion of the proposition holds with
the neighbourhood $U_\delta(\xi)$. Otherwise we have to decrease the neighbourhood by
choosing an $\varepsilon < \delta$ which satisfies the inequality $\varepsilon \frac{M}{m} \le 1$. Then

$$|x_n - \xi| \le \varepsilon \quad \Rightarrow \quad |x_{n+1} - \xi| \le \varepsilon^2 \, \frac{M}{m} \le \varepsilon.$$

This means that if an approximate value $x_n$ lies in $U_\varepsilon(\xi)$ then so does the subsequent
value $x_{n+1}$. Since $U_\varepsilon(\xi) \subset U_\delta(\xi)$, the quadratic error estimate from above is still
valid. Thus the assertion of the proposition is valid with the smaller neighbourhood
$U_\varepsilon(\xi)$. □

Example 8.15 In computing the root $\xi = \sqrt[3]{2}$ of $x^3 - 2 = 0$, we compare the bisection
method with starting interval $[-2, 2]$ and Newton's method with starting value
$x_1 = 2$. The interval boundaries $[a_n, b_n]$ and the iterates $x_n$ are listed in Tables 8.1
and 8.2, respectively. Newton's method gives the value

$$x_7 = 1.25992104989487$$

correct to 14 decimal places after only six iterations.


Table 8.1 Bisection method for calculating the third root of 2

 n    a_n                 b_n                 Error
 1   -2.00000000000000    2.00000000000000    4.00000000000000
 2    0.00000000000000    2.00000000000000    2.00000000000000
 3    1.00000000000000    2.00000000000000    1.00000000000000
 4    1.00000000000000    1.50000000000000    0.50000000000000
 5    1.25000000000000    1.50000000000000    0.25000000000000
 6    1.25000000000000    1.37500000000000    0.12500000000000
 7    1.25000000000000    1.31250000000000    0.06250000000000
 8    1.25000000000000    1.28125000000000    0.03125000000000
 9    1.25000000000000    1.26562500000000    0.01562500000000
10    1.25781250000000    1.26562500000000    0.00781250000000
11    1.25781250000000    1.26171875000000    0.00390625000000
12    1.25976562500000    1.26171875000000    0.00195312500000
13    1.25976562500000    1.26074218750000    0.00097656250000
14    1.25976562500000    1.26025390625000    0.00048828125000
15    1.25976562500000    1.26000976562500    0.00024414062500
16    1.25988769531250    1.26000976562500    0.00012207031250
17    1.25988769531250    1.25994873046875    0.00006103515625
18    1.25991821289063    1.25994873046875    0.00003051757813

Table 8.2 Newton's method for calculating the third root of 2

 n    x_n                 Error
 1    2.00000000000000    0.74007895010513
 2    1.50000000000000    0.24007895010513
 3    1.29629629629630    0.03637524640142
 4    1.26093222474175    0.00101117484688
 5    1.25992186056593    0.00000081067105
 6    1.25992104989539    0.00000000000052
 7    1.25992104989487    0.00000000000000

The error curves for the bisection method and Newton's method can be seen in
Fig. 8.6. A semi-logarithmic representation (MATLAB command semilogy) is used
there.
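The comparison of Example 8.15 can be retraced with a few lines of Python. The following is only a minimal sketch and not one of the programs accompanying the book; the function names `newton` and `bisection` are chosen here for illustration.

```python
def newton(f, df, x1, steps):
    """Return the Newton iterates x_1, ..., x_{steps+1} (assumes df(x_n) != 0)."""
    xs = [x1]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs


def bisection(f, a, b, steps):
    """Return the intervals [a_n, b_n]; assumes f(a) and f(b) have opposite signs."""
    intervals = [(a, b)]
    for _ in range(steps):
        c = (a + b) / 2
        if f(a) * f(c) <= 0:
            b = c  # the zero lies in the left half
        else:
            a = c  # the zero lies in the right half
        intervals.append((a, b))
    return intervals


f = lambda x: x**3 - 2
df = lambda x: 3 * x**2
xi = 2 ** (1 / 3)  # reference value of the third root of 2

newton_iterates = newton(f, df, 2.0, 6)
bisection_intervals = bisection(f, -2.0, 2.0, 17)

for n, x in enumerate(newton_iterates, start=1):
    print(n, f"{x:.14f}", f"{abs(x - xi):.14f}")
for n, (a, b) in enumerate(bisection_intervals, start=1):
    print(n, f"{a:.14f}", f"{b:.14f}", f"{b - a:.14f}")
```

In this sketch the error of the bisection method is taken to be the interval length $b_n - a_n$, as in Table 8.1, and the error of Newton's method is $|x_n - \xi|$.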

Remark 8.16 The convergence behaviour of Newton's method depends on the conditions
of Proposition 8.14. If the starting value $x_1$ is too far away from the zero $\xi$,
then the method might diverge, oscillate or converge to a different zero. If $f'(\xi) = 0$,
which means the zero $\xi$ has a multiplicity $> 1$, then the order of convergence may
be reduced.


Experiment 8.17 Open the applet Newton's method and test, using the sine
function, how the choice of the starting value influences the result (in the applet the
right interval boundary is the initial value). Experiment with the intervals $[-2, x_0]$
for $x_0 = 1, 1.1, 1.2, 1.3, 1.5, 1.57, 1.5707, 1.57079$ and interpret your observations.
Also carry out the calculations with the same starting values with the help of the
M-file mat08_2.m.


Experiment 8.18 With the help of the applet Newton's method, study how the order
of convergence drops for multiple zeros. For this purpose, use the two polynomial
functions given in the applet.

Fig. 8.6 Error of the bisection method and of Newton's method for the calculation of $\sqrt[3]{2}$

Remark 8.19 Variants of Newton's method can be obtained by evaluating the derivative
$f'(x_n)$ numerically. For example, the approximation

$$f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}$$

provides the secant method

$$x_{n+1} = x_n - \frac{(x_n - x_{n-1}) f(x_n)}{f(x_n) - f(x_{n-1})},$$

which computes $x_{n+1}$ as intercept of the secant through $(x_n, f(x_n))$ and
$(x_{n-1}, f(x_{n-1}))$ with the $x$-axis. It has a fractional order less than 2.
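The secant method of Remark 8.19 can be sketched in Python as follows. The function name `secant`, the two starting values and the stopping rule (step size below `tol`) are assumptions made for this illustration; they are not taken from the programs of the book.

```python
def secant(f, x0, x1, tol=1e-14, maxk=50):
    """Secant iteration; returns the list of all computed approximations."""
    xs = [x0, x1]
    for _ in range(maxk):
        x_prev, x = xs[-2], xs[-1]
        denom = f(x) - f(x_prev)
        if denom == 0:
            break  # secant is horizontal, no intersection with the x-axis
        x_new = x - (x - x_prev) * f(x) / denom
        xs.append(x_new)
        if abs(x_new - x) < tol:
            break
    return xs


# Third root of 2 as in Example 8.15, started with the values 2 and 1.5.
secant_iterates = secant(lambda x: x**3 - 2, 2.0, 1.5)
for n, x in enumerate(secant_iterates, start=1):
    print(n, f"{x:.14f}")
```

In contrast to Newton's method, no derivative is needed, but two starting values have to be supplied.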


8.3  Regression  Line  Through  the  Origin 

This  section  is  a  first  digression  into  data  analysis:  Given  a  collection  of  data  points 
scattered  in  the  plane,  find  the  line  of  best  fit  ( regression  line)  through  the  origin.  We 
will  discuss  this  problem  as  an  application  of  differentiation;  it  can  also  be  solved  by 
using  methods  of  linear  algebra.  The  general  problem  of  multiple  linear  regression 
will  be  dealt  with  in  Chap.  18. 

In the year 2002, the height $x$ [cm] and the weight $y$ [kg] of 70 students in
Computer Science at the University of Innsbruck were collected. The data can be
obtained from the M-file mat08_3.m.

The measurements $(x_i, y_i)$, $i = 1, \ldots, n$ of height and weight form a scatter plot
in the plane as shown in Fig. 8.7. Under the assumption that there is a linear relation
of the form $y = kx$ between height and weight, $k$ should be determined such that the
straight line $y = kx$ represents the scatter plot as closely as possible (Fig. 8.8). The
approach that we discuss below goes back to Gauss² and understands the data fit in
the sense of minimising the sum of squares of the errors.


² C.F. Gauss, 1777-1855.

Fig. 8.7 Scatter plot height/weight

Fig. 8.8 Line of best fit $y = kx$


Application 8.20 (Line of best fit through the origin) A straight line through the
origin

$$y = kx$$

is to be fitted to a scatter plot $(x_i, y_i)$, $i = 1, \ldots, n$. If $k$ is known, one can compute
the square of the deviation of the measurement $y_i$ from the value $kx_i$ given by the
equation of the straight line as

$$(y_i - kx_i)^2$$

(the square of the error). We are looking for the specific $k$ which minimises the sum
of squares of the errors; thus

$$F(k) = \sum_{i=1}^{n} (y_i - kx_i)^2 \to \min$$

Obviously, $F(k)$ is a quadratic function of $k$. First we compute the derivatives

$$F'(k) = \sum_{i=1}^{n} (-2x_i)(y_i - kx_i), \qquad F''(k) = \sum_{i=1}^{n} 2x_i^2.$$

By setting $F'(k) = 0$ we obtain the formula

$$F'(k) = -2 \sum_{i=1}^{n} x_i y_i + 2k \sum_{i=1}^{n} x_i^2 = 0.$$

Since evidently $F'' > 0$, its solution

$$k = \frac{\sum x_i y_i}{\sum x_i^2}$$

is the global minimum and gives the slope of the line of best fit.


Example  8.21  To  illustrate  the  regression  line  through  the  origin  we  use  the  Austrian 
consumer  price  index  2010-2016  (data  taken  from  [26]): 


year    2010    2011    2012    2013    2014    2015    2016
index   100.0   103.3   105.8   107.9   109.7   110.7   111.7

For the calculation it is useful to introduce new variables $x$ and $y$, where $x = 0$
corresponds to the year 2010 and $y = 0$ to the index 100. This means that $x =$
(year $-$ 2010) and $y =$ (index $-$ 100); $y$ describes the relative price increase (in per
cent) with respect to the year 2010. The re-scaled data are

x_i     0      1      2      3      4      5      6
y_i     0.0    3.3    5.8    7.9    9.7    10.7   11.7

We  are  looking  for  the  line  of  best  fit  to  these  data  through  the  origin.  For  this  purpose 
we  have  to  minimise 

$$F(k) = (3.3 - k \cdot 1)^2 + (5.8 - k \cdot 2)^2 + (7.9 - k \cdot 3)^2 + (9.7 - k \cdot 4)^2 + (10.7 - k \cdot 5)^2 + (11.7 - k \cdot 6)^2$$

which results in (rounded)

$$k = \frac{1 \cdot 3.3 + 2 \cdot 5.8 + 3 \cdot 7.9 + 4 \cdot 9.7 + 5 \cdot 10.7 + 6 \cdot 11.7}{1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 + 4 \cdot 4 + 5 \cdot 5 + 6 \cdot 6} = \frac{201.1}{91} = 2.21.$$

The line of best fit is thus

$$y = 2.21x$$

or transformed back

$$\text{index} = 100 + (\text{year} - 2010) \cdot 2.21.$$

The result is shown in Fig. 8.9, in a year/index-scale as well as in the transformed
variables. For the year 2017, extrapolation along the regression line would forecast

$$\text{index}(2017) = 100 + 7 \cdot 2.21 = 115.5.$$



Fig.  8.9  Consumer  price 
index  and  regression  line 


The actual consumer price index in 2017 had the value 114.0. Inspection of Fig. 8.9
shows  that  the  consumer  price  index  stopped  growing  linearly  around  2014;  thus  the 
straight  line  is  a  bad  fit  to  the  data  in  the  period  under  consideration.  How  to  choose 
better  regression  models  will  be  discussed  in  Chap.  18. 
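The computation of Example 8.21 can be checked with a short Python sketch. The helper name `slope_through_origin` is introduced here for illustration only; the data are those of the table above.

```python
def slope_through_origin(xs, ys):
    """Slope k = sum(x_i * y_i) / sum(x_i^2) of the best-fit line y = k x."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


years = [2010, 2011, 2012, 2013, 2014, 2015, 2016]
index = [100.0, 103.3, 105.8, 107.9, 109.7, 110.7, 111.7]

# Re-scaled variables: x = year - 2010, y = index - 100.
xs = [year - 2010 for year in years]
ys = [value - 100 for value in index]

k = slope_through_origin(xs, ys)
forecast_2017 = 100 + 7 * k  # extrapolation along the regression line

print(round(k, 2), round(forecast_2017, 1))
```

The same helper can be applied to any data set for which a proportional relation $y = kx$ is a reasonable model.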


8.4  Exercises 

1. Find out which of the following (continuous) functions are differentiable at
$x = 0$:

$$y = x|x|, \quad y = |x|^{1/2}, \quad y = |x|^{3/2}, \quad y = x \sin(1/x).$$


2.  Find  all  maxima  and  minima  of  the  functions 


/(*)  = 


X2  ±  1 


and $g(x) = x^2 e^{-x^2}$.


3. Find the maxima of the functions

$$y = \frac{1}{x\sqrt{2\pi}} \, e^{-(\log x)^2/2}, \; x > 0 \qquad \text{and} \qquad y = e^{-x} e^{-e^{-x}}, \; x \in \mathbb{R}.$$

These functions represent the densities of the standard lognormal distribution
and of the Gumbel distribution, respectively.

4. Find all maxima and minima of the function

$$f(x) = \frac{x}{\sqrt{x^4 + 1}},$$

determine on what intervals it is increasing or decreasing, analyse its behaviour
as $x \to \pm\infty$, and sketch its graph.

Fig. 8.10 Failure wedge with sliding surface


5.  Find  the  proportions  of  the  cylinder  which  has  the  smallest  surface  area  F  for  a 
given  volume  V. 

Hint. $F = 2r\pi h + 2r^2\pi \to \min$. Calculate the height $h$ as a function of the
radius $r$ from $V = r^2 \pi h$, substitute and minimise $F(r)$.

6. (From mechanics of solids) The moment of inertia with respect to the central
axis of a beam with rectangular cross section is $I = \frac{1}{12} b h^3$ ($b$ the width, $h$ the
height). Find the proportions of the beam which can be cut from a log with
circular cross section of given radius $r$ such that its moment of inertia becomes
maximal.

Hint. Write $b$ as function of $h$, $I(h) \to \max$.

7. (From soil mechanics) The mobilised cohesion $c_m(\theta)$ of a failure wedge with
sliding surface, inclined by an angle $\theta$, is

$$c_m(\theta) = \frac{\gamma h \sin(\theta - \varphi_m) \cos\theta}{2 \cos\varphi_m}.$$

Here $h$ is the height of the failure wedge, $\varphi_m$ the angle of internal friction, $\gamma$ the
specific weight of the soil (see Fig. 8.10). Show that the mobilised cohesion $c_m$
with given $h$, $\varphi_m$, $\gamma$ is a maximum for the angle of inclination $\theta = \varphi_m/2 + 45°$.

8. This exercise aims at investigating the convergence of Newton's method for
solving the equations

$$x^3 - 3x^2 + 3x - 1 = 0,$$
$$x^3 - 3x^2 + 3x - 2 = 0$$

on the interval $[0, 3]$.

(a) Open the applet Newton's method and carry out Newton's method for both
equations with an accuracy of 0.0001. Explain why you need a different
number of iterations.

(b) With the help of the M-file mat08_1.m, generate a list of approximations
in each case (starting value x1 = 1.5, tol = 100*eps, maxk = 100)
and plot the errors $|x_n - \xi|$ in each case using semilogy. Discuss
the results.

9. Apply the MATLAB program mat08_2.m to the functions which are defined
by the M-files mat08_f1.m and mat08_f2.m (with respective derivatives
mat08_df1.m and mat08_df2.m). Choose x1 = 2, maxk = 250.
How do you explain the results?

10. Rewrite the MATLAB program mat08_2.m so that termination occurs when
either the given number of iterations maxk or a given error bound tol is reached
(termination at the $n$th iteration, if either $n >$ maxk or $|f(x_n)| <$ tol). Compute
$n$, $x_n$ and the error $|f(x_n)|$. Test your program using the functions from
Exercise 8 and explain the results.

Hint. Consult the M-file mat08_ex9.m.

11. Write a MATLAB program which carries out the secant method for cubic
polynomials.

12. (a) By minimising the sum of squares of the errors, derive a formula for
the coefficient $c$ of the regression parabola $y = cx^2$ through the data
$(x_1, y_1), \ldots, (x_n, y_n)$.

(b) A series of measurements of braking distances $s$ [m] (without taking into
account the perception-reaction distance) of a certain type of car in dependence
on the velocity $v$ [km/h] produced the following values:

v_i    10    20    40    50    60    70    80    100    120
s_i    1     3     8     13    18    23    31    47     63

Calculate the coefficient $c$ of the regression parabola $s = cv^2$ and plot the result.

13. Show that the best horizontal straight line $y = d$ through the data points
$(x_i, y_i)$, $i = 1, \ldots, n$ is given by the arithmetic mean of the $y$-values:

$$d = \frac{1}{n} \sum_{i=1}^{n} y_i.$$

Hint. Minimise $G(d) = \sum_{i=1}^{n} (y_i - d)^2$.

14. (From geotechnics) The angle of internal friction of a soil specimen can be
obtained by means of a direct shear test, whereby the material is subjected to
normal stress $\sigma$ and the lateral shear stress $\tau$ at failure is recorded. In case
the cohesion is negligible, the relation between $\tau$ and $\sigma$ can be modelled by a
regression line through the origin of the form $\tau = k\sigma$. The slope of the regression
line is interpreted as the tangent of the friction angle $\varphi$, $k = \tan\varphi$. In a laboratory
experiment, the following data have been obtained for a specimen of glacial till
(data from [25]):

σ [kPa]   100   150   200   300   150   250   300   100   150   250   100   150   200   250
τ [kPa]   68    127   135   206   127   148   197   76    78    168   123   97    124   157

Calculate the angle of internal friction of the specimen.

15. (a) Convince yourself by applying the mean value theorem that the function
$f(x) = \cos x$ is a contraction (see Definition C.17) on the interval $[0, 1]$
and compute the fixed point $x^* = \cos x^*$ up to two decimal places using the
iteration of Proposition C.18.

(b) Write a MATLAB program which carries out the first $N$ iterations for the
computation of $x^* = \cos x^*$ for a given initial value $x_1 \in [0, 1]$ and displays
$x_1, x_2, \ldots, x_N$ in a column.


9 Fractals and L-systems

In geometry objects are often defined by explicit rules and transformations which
can easily be translated into mathematical formulas. For example, a circle is the set
of all points which are at a fixed distance $r$ from a centre $(a, b)$:

$$K = \{(x, y) \in \mathbb{R}^2 \,;\, (x - a)^2 + (y - b)^2 = r^2\}$$


or 


In contrast to that, the objects of fractal geometry are usually given by a recursion.
These fractal sets (fractals) have recently found many interesting applications,
e.g. in computer graphics (modelling of clouds, plants, trees, landscapes), in image
compression and data analysis. Furthermore fractals have a certain importance in
modelling growth processes.

Typical properties of fractals are often their non-integer dimension and the
self-similarity of the entire set with its pieces. The latter can frequently be found in nature,
e.g. in geology. There it is often difficult to decide from a photograph without a given
scale whether the object in question is a grain of sand, a pebble or a large piece of
rock. For that reason fractal geometry is often exuberantly called the geometry of
nature.

In this chapter we look at some examples of fractals in $\mathbb{R}^2$ and $\mathbb{C}$. Furthermore
we give a short introduction to L-systems and discuss, as an application, a simple
concept for modelling the growth of plants. For a more in-depth presentation we
refer to the textbooks [21, 22].


9.1  Fractals 


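To start with we generalise the notions of open and closed interval to subsets of $\mathbb{R}^2$.
For a fixed $\mathbf{a} = (a, b) \in \mathbb{R}^2$ and $\varepsilon > 0$ the set

$$B(\mathbf{a}, \varepsilon) = \left\{ (x, y) \in \mathbb{R}^2 \,;\, \sqrt{(x - a)^2 + (y - b)^2} < \varepsilon \right\}$$

is called an $\varepsilon$-neighbourhood of $\mathbf{a}$. Note that the set $B(\mathbf{a}, \varepsilon)$ is a circular disc (with
centre $\mathbf{a}$ and radius $\varepsilon$) where the boundary is missing.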

Definition 9.1 Let $A \subset \mathbb{R}^2$.

(a) A point $\mathbf{a} \in A$ is called interior point of $A$ if there exists an $\varepsilon$-neighbourhood
of $\mathbf{a}$ which itself is contained in $A$.

(b) $A$ is called open if each point of $A$ is an interior point.

(c) A point $\mathbf{c} \in \mathbb{R}^2$ is called boundary point of $A$ if every $\varepsilon$-neighbourhood of $\mathbf{c}$
contains at least one point of $A$ as well as a point of $\mathbb{R}^2 \setminus A$. The set of boundary
points of $A$ is denoted by $\partial A$ (boundary of $A$).

(d) A set is called closed if it contains all its boundary points.

(e) $A$ is called bounded if there is a number $r > 0$ with $A \subset B(\mathbf{0}, r)$.


Example 9.2 The square

$$Q = \{(x, y) \in \mathbb{R}^2 \,;\, 0 < x < 1 \text{ and } 0 < y < 1\}$$

is open since every point of $Q$ has an $\varepsilon$-neighbourhood which is contained in $Q$, see
Fig. 9.1, left picture. The boundary of $Q$ consists of four line segments

$$\{0, 1\} \times [0, 1] \,\cup\, [0, 1] \times \{0, 1\}.$$

Every $\varepsilon$-neighbourhood of a boundary point also contains points which are outside
of $Q$, see Fig. 9.1, middle picture. The square in Fig. 9.1, right picture,

$$\{(x, y) \in \mathbb{R}^2 \,;\, 0 < x \le 1 \text{ and } 0 < y \le 1\}$$

is neither closed nor open since the boundary point $(x, y) = (0, 0)$ is not an element
of the set and the set on the other hand contains the point $(x, y) = (1, 1)$ which is
not an interior point. All three sets are bounded since they are, for example, contained
in $B(\mathbf{0}, 2)$.

Fig. 9.1 Open (left), closed (middle) and neither open nor closed (right) square with side length 1

Fig. 9.2 Covering a curve using circles

Fractal dimension. Roughly speaking, points have dimension 0, line segments
dimension 1 and plane regions dimension 2. The concept of fractal dimension serves
to make finer distinctions. If, for example, a curve fills a plane region densely one
tends to assign to it a higher dimension than 1. Conversely, if a line segment has
many gaps, its dimension could be between 0 and 1.

Let $A \subset \mathbb{R}^2$ be bounded (and not empty) and let $N(A, \varepsilon)$ be the smallest number
of closed circles with radius $\varepsilon$ which are needed to cover $A$, see Fig. 9.2.

The following intuitive idea stands behind the definition of the fractal dimension
$d$ of $A$: For curve segments the number $N(A, \varepsilon)$ is inversely proportional to $\varepsilon$, for
plane regions inversely proportional to $\varepsilon^2$, so

$$N(A, \varepsilon) \approx C \cdot \varepsilon^{-d},$$

where $d$ denotes the dimension. Taking logarithms one obtains

$$\log N(A, \varepsilon) \approx \log C - d \log \varepsilon,$$

and

$$d \approx -\frac{\log N(A, \varepsilon) - \log C}{\log \varepsilon},$$

respectively. This approximation is getting more precise the smaller one chooses
$\varepsilon > 0$. Due to

$$\lim_{\varepsilon \to 0+} \frac{\log C}{\log \varepsilon} = 0$$

this leads to the following definition.

Fig. 9.3 Raster of the plane using squares of side length $\varepsilon$. The boxes that have a non-empty
intersection with the fractal are coloured in grey. In the picture we have $N(A, \varepsilon) = 21$

Definition 9.3 Let $A \subset \mathbb{R}^2$ be not empty, bounded and $N(A, \varepsilon)$ as above. If the limit

$$d = d(A) = -\lim_{\varepsilon \to 0+} \frac{\log N(A, \varepsilon)}{\log \varepsilon}$$

exists, then $d$ is called fractal dimension of $A$.

Remark 9.4 In the above definition it is sufficient to choose a zero sequence of the
form

$$\varepsilon_n = C \cdot q^n, \qquad 0 < q < 1$$

for $\varepsilon$. Furthermore it is not essential to use circular discs for the covering. One can just
as well use squares, see [5, Chap. 5]. Hence the number obtained by Definition 9.3
is also called box-dimension of $A$.

Experimentally the dimension of a fractal can be determined in the following
way: For various rasters of the plane with mesh size $\varepsilon_n$ one counts the number of
boxes which have a non-empty intersection with the fractal, see Fig. 9.3. Let us call
this number again $N(A, \varepsilon_n)$. If one plots $\log N(A, \varepsilon_n)$ as a function of $\log \varepsilon_n$ in a
double-logarithmic diagram and fits the best line to this graph (Sect. 18.1), then

$$d(A) \approx -\text{slope of the straight line}.$$

With this procedure one can, for example, determine the fractal dimension of the
coastline of Great Britain, see Exercise 1.

Example 9.5 The line segment (Fig. 9.4)

$$A = \{(x, y) \in \mathbb{R}^2 \,;\, a \le x \le b, \; y = c\}$$

has fractal dimension $d = 1$.

Fig. 9.4 Covering of a straight line segment using circles

Fig. 9.5 A set of points with box-dimension $d = \frac{1}{2}$

We choose

$$\varepsilon_n = (b - a) \cdot 2^{-n}, \qquad q = 1/2.$$

Due to $N(A, \varepsilon_n) = 2^{n-1}$ the following holds

$$-\frac{\log N(A, \varepsilon_n)}{\log \varepsilon_n} = -\frac{(n - 1) \log 2}{\log(b - a) - n \log 2} \to 1 \quad \text{as } n \to \infty.$$

Likewise, it can easily be shown: Every set that consists of finitely many points
has fractal dimension 0. Plane regions in $\mathbb{R}^2$ have fractal dimension 2. The fractal
dimension is in this way a generalisation of the intuitive notion of dimension. Still,
caution is advisable here as can be seen in the following example.

Example 9.6 The set $F = \{0, 1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots\}$ displayed in Fig. 9.5 has
box-dimension $d = 1/2$. We check this claim with the following MATLAB experiment.

Experiment 9.7 To determine the dimension of $F$ approximately with the help
of MATLAB we take the following steps. For $j = 1, 2, 3, \ldots$ we split the interval
$[0, 1]$ into $4^j$ equally large subintervals, set $\varepsilon_j = 4^{-j}$ and determine the number
$N_j = N(F, \varepsilon_j)$ of subintervals which have a non-empty intersection with $F$. Then
we plot $\log N_j$ as a function of $\log \varepsilon_j$ in a double-logarithmic diagram. The slope of
the secant

$$d_j = -\frac{\log N_{j+1} - \log N_j}{\log \varepsilon_{j+1} - \log \varepsilon_j}$$

is an approximation to $d$ which is steadily improving with growing $j$. The values
obtained by using the program mat09_1.m are given in the following table:

4^j    4      16     64     256    1024    4096     16384    65536    262144   1048576
d_j    0.79   0.61   0.55   0.52   0.512   0.5057   0.5028   0.5014   0.5007   0.50035

Verify the given values and determine that the approximations given by Definition 9.3

$$d \approx -\frac{\log N_j}{\log \varepsilon_j}$$

are much worse. Explain this behaviour.


Fig. 9.6 The construction of the Cantor set

Example 9.8 (Cantor set) We construct this set recursively using

$$A_0 = [0, 1]$$
$$A_1 = \left[0, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, 1\right]$$
$$A_2 = \left[0, \tfrac{1}{9}\right] \cup \left[\tfrac{2}{9}, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, \tfrac{7}{9}\right] \cup \left[\tfrac{8}{9}, 1\right]$$

One obtains $A_{n+1}$ from $A_n$ by removing the middle third of each line segment of
$A_n$, see Fig. 9.6.

The intersection of all these sets

$$A = \bigcap_{n=0}^{\infty} A_n$$

is called Cantor set. Let $|A_n|$ denote the length of $A_n$. Obviously the following holds
true: $|A_0| = 1$, $|A_1| = 2/3$, $|A_2| = (2/3)^2$ and $|A_n| = (2/3)^n$. Thus

$$|A| = \lim_{n \to \infty} |A_n| = \lim_{n \to \infty} (2/3)^n = 0,$$

which means that $A$ has length 0. Nevertheless, $A$ does not simply consist of discrete
points. More information about the structure of $A$ is given by its fractal dimension
$d$. To determine it we choose

$$\varepsilon_n = \frac{1}{2} \cdot 3^{-n}, \qquad \text{i.e. } q = 1/3,$$

and obtain (according to Fig. 9.6) the value $N(A, \varepsilon_n) = 2^n$. Thus

$$d = -\lim_{n \to \infty} \frac{\log 2^n}{\log 3^{-n} - \log 2} = \lim_{n \to \infty} \frac{n \log 2}{n \log 3 + \log 2} = \frac{\log 2}{\log 3} = 0.6309.$$

The Cantor set is thus an object between points and straight lines. The self-similarity
of $A$ is also noteworthy. Enlarging certain parts of $A$ results in copies of $A$. This
together with the non-integer dimension is a typical property of fractals.
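The construction of the sets $A_n$ and the convergence of the quotient in Example 9.8 can be followed numerically. The sketch below is an illustration only; the function names are chosen here, and exact fractions are used so that the interval end points are free of rounding errors.

```python
from fractions import Fraction
import math


def cantor_step(intervals):
    """Remove the open middle third of every interval (a, b) in the list."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result


def cantor_level(n):
    """Intervals of A_n, starting from A_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals


def dimension_estimate(n):
    """-log N(A, eps_n) / log eps_n with N = 2^n and eps_n = 3^(-n) / 2."""
    return n * math.log(2) / (n * math.log(3) + math.log(2))


for n in (1, 2, 5):
    level = cantor_level(n)
    print(n, len(level), sum(b - a for a, b in level))  # 2^n pieces, length (2/3)^n

for n in (1, 10, 100, 1000):
    print(n, dimension_estimate(n))
print("limit", math.log(2) / math.log(3))
```

The estimates approach $\log 2 / \log 3$ only slowly, because the term $\log 2$ in the denominator decays like $1/n$ relative to $n \log 3$.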


Fig. 9.7 Snowflakes of depth 0, 1, 2, 3 and 4

Fig. 9.8 Law of formation of the snowflake

Example 9.9 (Koch's snowflake¹) This is a figure of finite area whose boundary
is a fractal of infinite length. In Fig. 9.7 one can see the first five construction steps
of this fractal. In the step from $A_n$ to $A_{n+1}$ we substitute each straight boundary
segment by four line segments in the following way: We replace the central third by
two sides of an equilateral triangle, see Fig. 9.8.

The perimeter $U_n$ of the figure $A_n$ is computed as

$$U_0 = 3a, \qquad U_n = 3a \left(\frac{4}{3}\right)^n.$$

Hence the perimeter $U_\infty$ of Koch's snowflake $A_\infty$ is

$$U_\infty = \lim_{n \to \infty} U_n = \infty.$$

Next we compute the fractal dimension of $\partial A_\infty$. For that we set

$$\varepsilon_n = \frac{a}{2} \cdot 3^{-n}, \qquad \text{i.e. } q = 1/3.$$

Since one can use a circle of radius $\varepsilon_n$ to cover each straight boundary piece, we
obtain

$$N(\partial A_\infty, \varepsilon_n) \le 3 \cdot 4^n$$

and hence

$$d = d(\partial A_\infty) \le \frac{\log 4}{\log 3} \approx 1.262.$$

A covering using equilateral triangles of side length $\varepsilon_n$ shows that $N(\partial A_\infty, \varepsilon_n)$ is
proportional to $4^n$ and thus

$$d = \frac{\log 4}{\log 3}.$$

The boundary of the snowflake is hence a geometric object between a curve
and a plane region.


¹ H. von Koch, 1870-1924.

9.2 Mandelbrot Sets

An interesting class of fractals can be obtained with the help of iteration methods.
As an example we consider in $\mathbb{C}$ the iteration

$$z_{n+1} = z_n^2 + c.$$

Setting $z = x + iy$ and $c = a + ib$ one obtains, by separating the real and the
imaginary part, the equivalent real form of the iteration

$$x_{n+1} = x_n^2 - y_n^2 + a,$$
$$y_{n+1} = 2 x_n y_n + b.$$

The real representation is important when working with a programming language
that does not support complex arithmetic.

First we investigate for which values of $c \in \mathbb{C}$ the iteration

$$z_{n+1} = z_n^2 + c, \qquad z_0 = 0$$

remains bounded. In the present case this is equivalent to $|z_n| \not\to \infty$ for $n \to \infty$.
The set of all $c$ with this property is obviously not empty since it contains $c = 0$. On
the other hand it is bounded since the iteration always diverges for $|c| > 2$ as can
easily be verified with MATLAB.

Definition 9.10 The set

$$M = \{c \in \mathbb{C} \,;\, |z_n| \not\to \infty \text{ as } n \to \infty\}$$

is called Mandelbrot set² of the iteration $z_{n+1} = z_n^2 + c$, $z_0 = 0$.

To  get  an  impression  of  M  we  carry  out  a  numerical  experiment  in  MATLAB. 

Experiment 9.11 To visualise the Mandelbrot set $M$ one first chooses a raster of a
certain region, for example

$$-2 \le \operatorname{Re} c \le 1, \qquad -1.15 \le \operatorname{Im} c \le 1.15.$$

Next for each point of the raster one carries out a large number of iterations (e.g. 80)
and decides then whether the iterations remain bounded (for example $|z_n| \le 2$). If
this is the case one colours the point in black. This way one successively obtains
a picture of $M$. For your experiments use the MATLAB program mat09_2.m and
modify it as required. This way generate in particular the pictures in Fig. 9.9 in high
resolution.
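The procedure of Experiment 9.11 can also be sketched in Python. The following lines are not the program mat09_2.m but a minimal illustration with hypothetical names; instead of a graphic they print a coarse character picture, and the bound 2 together with 80 iterations serves as the (heuristic) test for boundedness. The function `step_real` shows one step of the iteration in real form.

```python
def step_real(x, y, a, b):
    """One step of the iteration in real form: returns (x_{n+1}, y_{n+1})."""
    return x * x - y * y + a, 2 * x * y + b


def stays_bounded(c, max_iter=80, radius=2.0):
    """Heuristic test: True if |z_n| <= radius for the first max_iter iterates."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > radius:
            return False
    return True


def mandelbrot_raster(width=60, height=24):
    """Character picture of M on -2 <= Re c <= 1, -1.15 <= Im c <= 1.15."""
    rows = []
    for i in range(height):
        im = 1.15 - 2.3 * i / (height - 1)
        row = ""
        for j in range(width):
            re = -2.0 + 3.0 * j / (width - 1)
            row += "#" if stays_bounded(complex(re, im)) else "."
        rows.append(row)
    return rows


print("\n".join(mandelbrot_raster()))
```

Increasing `width`, `height` and `max_iter` refines the picture at the cost of a longer running time.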


² B. Mandelbrot, 1924-2010.

Fig. 9.9 The Mandelbrot set of the iteration $z_{n+1} = z_n^2 + c$, $z_0 = 0$ and enlargement of a section


Figure 9.9 shows as result a little apple man which has smaller apple men attached
which finally develop into an antenna. Here one already recognises the self-similarity.
If an enlargement of a certain detail on the antenna ($-1.8 \le \operatorname{Re} c \le -1.72$, $-0.03 \le
\operatorname{Im} c \le 0.03$) is made, one finds an almost perfect copy of the complete apple man.
The Mandelbrot set is one of the most popular fractals and one of the most complex
mathematical objects which can be visualised.


9.3  Julia  Sets 

Again we consider the iteration

$$z_{n+1} = z_n^2 + c.$$

This time, however, we interchange the roles of $z_0$ and $c$.

Definition 9.12 For a given $c \in \mathbb{C}$, the set

$$J_c = \{z_0 \in \mathbb{C} \,;\, |z_n| \not\to \infty \text{ as } n \to \infty\}$$

is called Julia set³ of the iteration $z_{n+1} = z_n^2 + c$.

The Julia set for the parameter value $c$ hence consists of those initial values for
which the iteration remains bounded. For some values of $c$ the pictures of $J_c$ are
displayed in Fig. 9.10. Julia sets have many interesting properties; for example,

$$J_c \text{ is connected} \;\Leftrightarrow\; c \in M.$$


³ G. Julia, 1893-1978.

Fig. 9.10 Julia sets of the iteration $z_{n+1} = z_n^2 + c$ for the parameter values $c = -0.75$ (top left),
$c = 0.35 + 0.35\,i$ (top right), $c = -0.03 + 0.655\,i$ (bottom left) and $-0.12 + 0.74\,i$ (bottom right)


Thus one can alternatively define the Mandelbrot set $M$ as

$$M = \{c \in \mathbb{C} \,;\, J_c \text{ is connected}\}.$$

Furthermore the boundary of a Julia set is self-similar and a fractal.

Experiment 9.13 Using the MATLAB program mat09_3.m plot the Julia sets $J_c$ in
Fig. 9.10 in high definition. Also try other values of $c$.
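For readers who prefer Python, the membership test behind Experiment 9.13 can be sketched as follows. This is not the program mat09_3.m; the function name, the escape radius 2 (adequate for $|c| \le 2$) and the iteration count are assumptions of this illustration. Compared with the Mandelbrot experiment, only the roles of $z_0$ and $c$ are interchanged.

```python
def in_julia_set(z0, c, max_iter=80, radius=2.0):
    """Heuristic test: True if the iterates started at z0 stay within the radius."""
    z = z0
    for _ in range(max_iter):
        if abs(z) > radius:
            return False
        z = z * z + c
    return abs(z) <= radius


# Coarse character picture of J_c for c = -0.75 on [-2, 2] x [-1, 1].
c = -0.75
picture = []
for i in range(21):
    im = 1.0 - 0.1 * i
    picture.append("".join(
        "#" if in_julia_set(complex(-2.0 + 0.05 * j, im), c) else "."
        for j in range(81)))
print("\n".join(picture))
```

Other parameter values from Fig. 9.10 can be tried by changing `c`.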


9.4  Newton's  Method  in  C 

Since  the  arithmetic  in  C  is  an  extension  of  that  in  R,  many  concepts  of  real  analysis 
can  be  transferred  directly  to  C.  For  example,  a  function  /:  C  ^  C:  m  f(z)  is 
called  complex  differentiable  if  the  difference  quotient 

f(z  +  Az)  -  f(z) 


Az 


9.4  Newton's  Method  in  C 
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has  a  limit  as  Az  ->  0.  This  limit  is  again  denoted  by 


lim 

Az^O 


f  (z  +  Az) 
Az 


m 


and called the complex derivative of $f$ at the point $z$. Since differentiation in $\mathbb{C}$ is defined in the same way as differentiation in $\mathbb{R}$, the same differentiation rules hold. In particular, any polynomial

$$f(z) = a_n z^n + \cdots + a_1 z + a_0$$

is complex differentiable and has the derivative

$$f'(z) = n a_n z^{n-1} + \cdots + a_1.$$

Like the real derivative (see Sect. 7.3), the complex derivative has an interpretation as a linear approximation

$$f(z) \approx f(z_0) + f'(z_0)(z - z_0)$$

for $z$ close to $z_0$.

Let $f\colon \mathbb{C} \to \mathbb{C}\colon z \mapsto f(z)$ be a complex differentiable function with $f(\zeta) = 0$ and $f'(\zeta) \ne 0$. In order to compute the zero $\zeta$ of the function $f$, one first computes the linear approximation starting from the initial value $z_0$, so

$$z_1 = z_0 - \frac{f(z_0)}{f'(z_0)}.$$

Subsequently $z_1$ is used as the new initial value and the procedure is iterated. In this way one obtains Newton's method in $\mathbb{C}$:

$$z_{n+1} = z_n - \frac{f(z_n)}{f'(z_n)}.$$

For initial values $z_0$ close to $\zeta$ the procedure converges quadratically (as in $\mathbb{R}$). Otherwise, however, the situation can become very complicated.

In  1983  Eckmann  [9]  investigated  Newton’s  method  for  the  function 

$$f(z) = z^3 - 1 = (z - 1)(z^2 + z + 1).$$

This function has three roots in $\mathbb{C}$:

$$\zeta_1 = 1, \qquad \zeta_{2,3} = -\frac{1}{2} \pm \mathrm{i}\,\frac{\sqrt{3}}{2}.$$

Naively one could think that the complex plane $\mathbb{C}$ is split into three equally large sectors, where the iteration with initial values in sector $S_1$ converges to $\zeta_1$, the ones in $S_2$ to $\zeta_2$, etc., see Fig. 9.11.

Fig. 9.12 Actual regions of attraction of Newton's iteration for finding the roots of $z^3 - 1$ and enlargement of a part


A  numerical  experiment,  however,  shows  that  it  is  not  that  way.  If  one  colours  the 
initial  values  according  to  their  convergence,  one  obtains  a  very  complex  picture. 
One  can  prove  (however,  not  easily  imagine)  that  at  every  point  where  two  colours 
meet,  the  third  colour  is  also  present.  The  boundaries  of  the  regions  of  attraction  are 
dominated  by  pincer-like  motifs  which  reappear  again  and  again  when  enlarging  the 
scale,  see  Fig.  9.12.  The  boundaries  of  the  regions  of  attraction  are  Julia  sets.  Again 
we  have  found  fractals. 
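
The colouring experiment described above can be sketched as follows. This is a minimal illustration and not the program mat09_4.m: the names `newton_root_index` and `basin_picture`, the tolerance, the iteration cap and the plotted window are assumptions. Each initial value is labelled by the root of $z^3 - 1$ that Newton's iteration reaches.

```python
import cmath

# The three roots of z**3 - 1: 1 and -1/2 +- i*sqrt(3)/2.
ROOTS = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]


def newton_root_index(z0, max_iter=50, tol=1e-10):
    """Run Newton's method for z**3 - 1; return 0, 1 or 2 for the root reached, else -1."""
    z = complex(z0)
    for _ in range(max_iter):
        if z == 0:
            return -1  # derivative 3*z**2 vanishes, the step is undefined
        z = z - (z**3 - 1) / (3 * z**2)
        for k, root in enumerate(ROOTS):
            if abs(z - root) < tol:
                return k
    return -1  # no convergence within max_iter steps


def basin_picture(n=41, half_width=1.5):
    """Text picture of the regions of attraction on a square around the origin."""
    symbols = "123?"  # index -1 (no convergence) selects the last symbol "?"
    rows = []
    for i in range(n):
        y = half_width - 2 * half_width * i / (n - 1)
        rows.append("".join(
            symbols[newton_root_index(complex(-half_width + 2 * half_width * j / (n - 1), y))]
            for j in range(n)))
    return "\n".join(rows)


if __name__ == "__main__":
    print(basin_picture())
```

Even at this coarse resolution the three regions are visibly not sectors; zooming into a boundary region (smaller `half_width` around a shifted centre) would be the natural next experiment.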


Experiment 9.14 Using the MATLAB program mat09_4.m, carry out an experiment. Convince yourself of the self-similarity of the appearing Julia sets by producing suitable enlargements of the boundaries of the regions of attraction.


9.5  L-systems 

The  formalism  of  L-systems  was  developed  by  Lindenmayer  around  1968  in  order 
to  model  the  growth  of  plants.  It  also  turned  out  that  many  fractals  can  be  created  this 
way.  In  this  section  we  give  a  brief  introduction  to  L-systems  and  discuss  a  possible 
implementation  in  maple. 


4 A. Lindenmayer, 1925-1989.

Definition 9.15 An L-system consists of the following five components:

(a) A finite set $B$ of symbols, the so-called alphabet. The elements of $B$ are called letters, and any string of letters is called a word.

(b) Certain substitution rules. These rules determine how the letters of the current word are to be replaced in each iteration step.

(c) The initial word $w \in W$. The initial word is also called axiom or seed.

(d)  The  number  of  iteration  steps  which  one  wants  to  carry  out.  In  each  of  these 
steps,  every  letter  of  the  current  word  is  replaced  according  to  the  substitution 
rules. 

(e)  A  graphical  interpretation  of  the  word. 


Let $W$ be the set of all words that can be formed in the given L-system. The substitution rules can be interpreted as a mapping from $B$ to $W$:

$$S\colon B \to W\colon b \mapsto S(b).$$

Example 9.16 Consider the alphabet $B = \{\mathrm{f}, \mathrm{p}, \mathrm{m}\}$ consisting of the three letters f, p and m. As substitution rules for this alphabet we take

S(f) = fpfmfmffpfpfmf, S(p) = p, S(m) = m

and consider the axiom w = fpfpfpf. An application of the substitution rules shows that, after one substitution, the word fpf becomes the new word fpfmfmffpfpfmfpfpfmfmffpfpfmf. If one applies the substitution rules to the axiom then one obtains a new word. Applying the substitution rules to that again gives a new word, and so on. Each of these words can be interpreted as a polygon by assigning the following meaning to the individual letters:

f means forward by one unit;
p stands for a rotation of $\alpha$ radians (plus);
m stands for a rotation of $-\alpha$ radians (minus).

Here $0 < \alpha < \pi$ is a chosen angle. One plots the polygon by choosing an arbitrary initial point and an arbitrary initial direction. Then one sequentially processes the letters of the word to be displayed according to the rules above.
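
The substitution step and the graphical interpretation of Example 9.16 can be sketched in Python as follows. Words are represented as strings; the function names `substitute`, `iterate` and `polygon` are hypothetical choices made for this sketch and do not refer to the book's worksheets.

```python
import math

# Substitution rules of Example 9.16; p and m are mapped to themselves.
RULES = {"f": "fpfmfmffpfpfmf", "p": "p", "m": "m"}


def substitute(word, rules=RULES):
    """Replace every letter of the word according to the substitution rules."""
    return "".join(rules.get(letter, letter) for letter in word)


def iterate(axiom, steps, rules=RULES):
    """Apply the substitution rules `steps` times, starting from the axiom."""
    word = axiom
    for _ in range(steps):
        word = substitute(word, rules)
    return word


def polygon(word, alpha, start=(0.0, 0.0), direction=0.0):
    """Vertices of the polygon: f moves one unit, p turns by +alpha, m by -alpha."""
    x, y = start
    points = [(x, y)]
    for letter in word:
        if letter == "f":
            x += math.cos(direction)
            y += math.sin(direction)
            points.append((x, y))
        elif letter == "p":
            direction += alpha
        elif letter == "m":
            direction -= alpha
    return points


if __name__ == "__main__":
    print(substitute("fpf"))
    print(polygon("fpfpfpf", math.pi / 2))
```

With $\alpha = \pi/2$ the axiom fpfpfpf yields the four sides of a unit square, and `polygon(iterate("fpfpfpf", 2), math.pi / 2)` gives the vertex list of the second refinement, ready for any plotting routine.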

In maple, lists and the substitution command subs lend themselves to the implementation of L-systems. In the example above the axiom would hence be defined by

a := [f,p,f,p,f,p,f]

and the substitution rules would be

a -> subs(f = (f,p,f,m,f,m,f,f,p,f,p,f,m,f), a).

The  letters  p  and  m  do  not  change  in  the  example,  and  they  are  fixed  points  in  the 
construction.  For  the  purpose  of  visualisation  one  can  use  polygons  in  maple,  given 
by  lists  of  points  (in  the  plane).  These  lists  can  be  plotted  easily  using  the  command 
plot. 

Construction of fractals. With the graphical interpretation above and $\alpha = \pi/2$,
the  axiom  fpfpfpf  is  a  square  which  is  passed  through  in  a  counterclockwise 
direction.  The  substitution  rule  converts  a  straight  line  segment  into  a  zigzag  line. 
By  an  iterative  application  of  the  substitution  rule  the  axiom  develops  into  a  fractal. 


Experiment 9.17 Using the maple worksheet mp09_1.mws create different fractals. Further, try to understand the procedure fractal in detail.

Example 9.18 The substitution rule for Koch's curve is

a -> subs(f = (f,p,f,m,m,f,p,f), a).

Depending on which axiom one uses, one can build fractal curves or snowflakes from that, see the maple worksheet mp09_1.mws.

Simulation  of  plant  growth.  As  a  new  element  branchings  (ramifications)  are  added 
here.  Mathematically  one  can  describe  this  using  two  new  symbols: 

v  stands  for  a  ramification; 
e  stands  for  the  end  of  the  branch. 

Let  us  look,  for  example,  at  the  word 

[f,p,f,v,p,p,f,p,f,e,v,m,f,m,f,e,f,p,f,v,p,f,p,f,e,f,m,f].

If one removes all branchings that start with v and end with e from the list then one obtains the stem of the plant

stem := [f,p,f,f,p,f,f,m,f].

After the second f in the stem a double branching obviously takes place and the branches

branch1 := [p,p,f,p,f] and branch2 := [m,f,m,f]

sprout. Further up, the stem branches again with the branch [p,f,p,f].

Fig. 9.13 Plants created using the maple worksheet mp09_2.mws


For a more realistic modelling one can introduce additional parameters. For example, asymmetry can be built in by rotating by the positive angle $\alpha$ at p and by the negative angle $-\beta$ at m. In the program mp09_2.mws that was done, see Fig. 9.13.

Experiment 9.19 Using the maple worksheet mp09_2.mws create different artificial plants. Further, try to understand the procedure grow in detail.

To visualise the created plants one can use lists of polygons in maple, i.e. lists of points (in the plane). To implement the branchings one conveniently uses a recursive stack. Whenever one comes across the command v for a branching, one saves the current state as the topmost value in the stack. A state is described by three numbers $(x, y, t)$ where $x$ and $y$ denote the position in the $(x, y)$-plane and $t$ the angle enclosed with the positive $x$-axis. Conversely, one removes the topmost state from the stack if one comes across the end of a branch e and returns to this state in order to continue the plot. At the beginning the stack is empty (at the end it should be as well).
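
The stack mechanism just described can be sketched in Python as follows. This is not the procedure grow from mp09_2.mws; the function name `plant_segments`, the starting state (origin, pointing upwards) and the representation of the plant as a list of line segments are assumptions of this sketch.

```python
import math

# The example word from the text, written as a string.
WORD = "fpfvppfpfevmfmfefpfvpfpfefmf"


def plant_segments(word, alpha, beta=None, step=1.0):
    """Return the line segments ((x0, y0), (x1, y1)) drawn for a word with branchings."""
    if beta is None:
        beta = alpha  # symmetric case: rotate by -alpha at m
    x, y, t = 0.0, 0.0, math.pi / 2  # assumed start: origin, pointing upwards
    stack = []  # saved states (x, y, t)
    segments = []
    for letter in word:
        if letter == "f":
            nx, ny = x + step * math.cos(t), y + step * math.sin(t)
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif letter == "p":
            t += alpha
        elif letter == "m":
            t -= beta
        elif letter == "v":
            stack.append((x, y, t))  # branching: remember where the branch starts
        elif letter == "e":
            if not stack:
                raise ValueError("end of branch without matching ramification")
            x, y, t = stack.pop()  # end of branch: return to the saved state
    if stack:
        raise ValueError("unclosed ramification: stack not empty at the end")
    return segments


if __name__ == "__main__":
    for segment in plant_segments(WORD, math.radians(25)):
        print(segment)
```

The final check mirrors the remark that the stack should be empty at the end; a word with unbalanced v and e is rejected instead of being drawn incorrectly.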

Extensions.  In  the  context  of  L-systems  many  generalisations  are  possible  which 
can  make  the  emerging  structures  more  realistic.  For  example  one  could: 

(a)  Represent  the  letter  f  by  shorter  segments  as  one  moves  further  away  from  the 
root  of  the  plant.  For  that,  one  has  to  save  the  distance  from  the  root  as  a  further 
state  parameter  in  the  stack. 

(b) Introduce randomness by using different substitution rules for one and the same letter and in each step choosing one at random. For example, the substitution rules for random weeds could be as follows:

f -> (f,v,p,f,e,f,v,m,f,e,f) with probability 1/3;

f -> (f,v,p,f,e,f) with probability 1/3;

f -> (f,v,m,f,e,f) with probability 1/3.

Using random numbers one selects the corresponding rule in each step.

Experiment 9.20 Using the maple worksheet mp09_3.mws create random plants. Further, try to understand the implemented substitution rule in detail.
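
The random choice between the three substitution rules for weeds can be sketched as follows. The sketch is independent of the worksheet mp09_3.mws; the names `random_substitute` and `grow_weed` and the optional `seed` parameter are assumptions introduced here. `random.Random.choice` picks each of the three rules with probability 1/3.

```python
import random

# The three substitution rules for the letter f, each used with probability 1/3.
WEED_RULES = ["fvpfefvmfef", "fvpfef", "fvmfef"]


def random_substitute(word, rng):
    """Replace every f by a randomly chosen rule; all other letters stay fixed."""
    return "".join(rng.choice(WEED_RULES) if letter == "f" else letter
                   for letter in word)


def grow_weed(steps, seed=None):
    """Start from the axiom f and apply the random substitution `steps` times."""
    rng = random.Random(seed)  # a fixed seed makes the plant reproducible
    word = "f"
    for _ in range(steps):
        word = random_substitute(word, rng)
    return word


if __name__ == "__main__":
    print(grow_weed(2, seed=1))
```

Since every rule contains matching pairs of v and e, each generated word has balanced branchings and can be drawn with a stack as described above.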


9.6  Exercises 

1. Determine experimentally the fractal dimension of the coastline of Great Britain. In order to do that, take a map of Great Britain (e.g. a copy from an atlas) and raster the map using different mesh sizes (e.g. with 1/64, 1/32, 1/16, 1/8 and 1/4 of the north-south extent). Count the boxes which contain parts of the coastline and display this number as a function of the mesh size in a double-logarithmic diagram. Fit the best line through these points and determine the fractal dimension in question from the slope of the straight line.

2. Using the program mat09_3.m visualise the Julia sets of $z_{n+1} = z_n^2 + c$ for $c = -1.25$ and $c = 0.365 - 0.3\,\mathrm{i}$. Search for interesting details.

3. Let $f(z) = z^3 - 1$ with $z = x + \mathrm{i}y$. Use Newton's method to solve $f(z) = 0$ and separate the real part and the imaginary part, i.e. find the functions $g_1$ and $g_2$ with

$$x_{n+1} = g_1(x_n, y_n),$$
$$y_{n+1} = g_2(x_n, y_n).$$

4. Modify the procedure grow in the program mp09_2.mws by representing the letter f by shorter segments depending on how far it is away from the root. With that plot the umbel from Experiment 9.19 again.

5. Modify the program mp09_3.mws by attributing new probabilities to the existing substitution rules (or invent new substitution rules). Use your modified program to plot some plants.

6. Modify the program mat09_3.m to visualise the Julia sets of $z_{n+1} = z_n^{2k} - c$ for $c = -1$ and integer values of $k$. Observe how varying $k$ affects the shape of the Julia set. Try other values of $c$ as well.

7. Modify the program mat09_3.m to visualise the Julia sets of

$$z_{n+1} = z_n^3 + (c - 1) z_n - c.$$

Study especially the behaviour of the Julia sets when $c$ ranges between 0.60 and 0.65.
The derivative of a function $y = F(x)$ describes its local rate of change, i.e. the change $\Delta y$ of the $y$-value with respect to the change $\Delta x$ of the $x$-value in the limit $\Delta x \to 0$; more precisely

$$f(x) = F'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{F(x + \Delta x) - F(x)}{\Delta x}.$$


Conversely, the question about the reconstruction of a function $F$ from its local rate of change $f$ leads to the notion of indefinite integrals which comprises the totality of all functions that have $f$ as their derivative, the antiderivatives of $f$. Chapter 10 addresses this notion, its properties, some basic examples and applications.

By multiplying the rate of change $f(x)$ with the change $\Delta x$ one obtains an approximation to the change of the values of the antiderivative $F$ in the segment of length $\Delta x$:

$$\Delta y = F(x + \Delta x) - F(x) \approx f(x)\,\Delta x.$$

Adding up these local changes in an interval, for instance between $x = a$ and $x = b$ in steps of length $\Delta x$, gives an approximation to the total change $F(b) - F(a)$. The limit $\Delta x \to 0$ (with an appropriate increase of the number of summands) leads to the notion of the definite integral of $f$ in the interval $[a, b]$, which is the subject of Chap. 11.
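
The idea of adding up the local changes $f(x)\,\Delta x$ can be tried out numerically. The following sketch is an illustration added here with hypothetical names; it uses $F(x) = x^3/3$ with rate of change $f(x) = x^2$ on the interval $[0, 2]$ and compares the accumulated sum with $F(2) - F(0) = 8/3$.

```python
def accumulated_change(f, a, b, n):
    """Sum the local changes f(x) * dx over n steps of equal length from a to b."""
    dx = (b - a) / n
    return sum(f(a + j * dx) * dx for j in range(n))


def rate(x):
    """Rate of change f(x) = x**2 of the function F(x) = x**3 / 3."""
    return x * x


if __name__ == "__main__":
    exact = 2.0**3 / 3 - 0.0**3 / 3  # total change F(2) - F(0)
    for n in (10, 100, 1000):
        print(n, accumulated_change(rate, 0.0, 2.0, n), exact)
```

The printed values approach the total change as the number of steps grows, which is the behaviour the limit $\Delta x \to 0$ is meant to capture.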


10.1 Indefinite Integrals

In Sect. 7.2 it was shown that the derivative of a constant is zero. The following proposition shows that the converse is also true.

Proposition 10.1 If the function $F$ is differentiable on $(a, b)$ and $F'(x) = 0$ for all $x \in (a, b)$ then $F$ is constant. This means that $F(x) = c$ for a certain $c \in \mathbb{R}$ and all $x \in (a, b)$.

Proof We choose an arbitrary $x_0 \in (a, b)$ and set $c = F(x_0)$. If now $x \in (a, b)$ then, according to the mean value theorem (Proposition 8.4),

$$F(x) - F(x_0) = F'(\xi)(x - x_0)$$

for a point $\xi$ between $x$ and $x_0$. Since $F'(\xi) = 0$ it follows that $F(x) = F(x_0) = c$. This holds for all $x \in (a, b)$, consequently $F$ has to be equal to the constant function with value $c$. □

Definition 10.2 (Antiderivatives) Let $f$ be a real-valued function on an interval $(a, b)$. An antiderivative of $f$ is a differentiable function $F\colon (a, b) \to \mathbb{R}$ whose derivative $F'$ equals $f$.

Example 10.3 The function $F(x) = \frac{x^3}{3}$ is an antiderivative of $f(x) = x^2$, as is $G(x) = \frac{x^3}{3} + 5$.

Proposition 10.1 implies that antiderivatives are unique up to an additive constant.

Proposition 10.4 Let $F$ and $G$ be antiderivatives of $f$ in $(a, b)$. Then $F(x) = G(x) + c$ for a certain $c \in \mathbb{R}$ and all $x \in (a, b)$.

Proof Since $F'(x) - G'(x) = f(x) - f(x) = 0$ for all $x \in (a, b)$, an application of Proposition 10.1 gives the desired result. □

Definition 10.5 (Indefinite integrals) The indefinite integral

$$\int f(x)\,dx$$

denotes the totality of all antiderivatives of $f$.

Once a particular antiderivative $F$ has been found, one writes accordingly

$$\int f(x)\,dx = F(x) + c.$$

Example 10.6 The indefinite integral of the quadratic function (Example 10.3) is $\int x^2\,dx = \frac{x^3}{3} + c$.


Example 10.7 (a) An application of indefinite integration to the differential equation of the vertical throw: Let $w(t)$ denote the height (in metres [m]) at time $t$ (in seconds [s]) of an object above ground level ($w = 0$). Then

$$w'(t) = v(t)$$

is the velocity of the object (positive in upward direction) and

$$v'(t) = a(t)$$

the acceleration (positive in upward direction). In this coordinate system the gravitational acceleration

$$g = 9.81 \ [\mathrm{m/s^2}]$$

acts downwards, consequently

$$a(t) = -g.$$

Velocity and distance are obtained by inverting the differentiation process:

$$v(t) = \int a(t)\,dt + c_1 = -gt + c_1,$$
$$w(t) = \int v(t)\,dt + c_2 = \int (-gt + c_1)\,dt + c_2 = -\frac{g}{2}\,t^2 + c_1 t + c_2,$$

where the constants $c_1, c_2$ are determined by the initial conditions:

$c_1 = v(0)$ ... initial velocity,

$c_2 = w(0)$ ... initial height.

(b) A concrete example: the free fall from a height of 100 m. Here

$$w(0) = 100, \qquad v(0) = 0$$

and thus

$$w(t) = -\frac{1}{2}\,9.81\,t^2 + 100.$$

The travelled distance as a function of time (Fig. 10.1) is given by a parabola. The time of impact $t_0$ is obtained from the condition $w(t_0) = 0$, i.e.

$$0 = -\frac{1}{2}\,9.81\,t_0^2 + 100, \qquad t_0 = \sqrt{200/9.81} \approx 4.5 \ [\mathrm{s}],$$

the velocity at impact is

$$v(t_0) = -g t_0 \approx -44.3 \ [\mathrm{m/s}] \approx -160 \ [\mathrm{km/h}].$$

Fig. 10.1 Free fall: travelled distance as function of time
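
The formulas of Example 10.7 are easily evaluated numerically. The following Python sketch is an illustration added here; the function names are hypothetical. It computes the time of impact as the positive root of $w(t) = 0$ for general initial values $w(0)$ and $v(0)$ and reproduces the numbers of the free fall from 100 m.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def height(t, w0, v0, g=G):
    """w(t) = -g/2 * t**2 + v0 * t + w0."""
    return -0.5 * g * t * t + v0 * t + w0


def velocity(t, v0, g=G):
    """v(t) = -g * t + v0 (positive in upward direction)."""
    return -g * t + v0


def impact_time(w0, v0, g=G):
    """Positive root of w(t) = 0, assuming w0 >= 0."""
    return (v0 + math.sqrt(v0 * v0 + 2.0 * g * w0)) / g


if __name__ == "__main__":
    t0 = impact_time(100.0, 0.0)
    v_impact = velocity(t0, 0.0)
    print("time of impact [s]:", round(t0, 2))
    print("velocity at impact [m/s]:", round(v_impact, 1))
    print("velocity at impact [km/h]:", round(3.6 * v_impact, 1))
```

The velocity is negative because the object moves downwards in the chosen coordinate system.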


10.2  Integration  Formulas 

It follows immediately from Definition 10.5 that indefinite integration can be seen as the inversion of differentiation. It is, however, only unique up to a constant:

$$\left(\int f(x)\,dx\right)' = f(x), \qquad \int g'(x)\,dx = g(x) + c.$$

With this consideration and the formulas from Sect. 7.4 one easily obtains the basic integration formulas stated in the following table. The formulas are valid in the respective domains.

The formulas in Table 10.1 are a direct consequence of those in Table 7.1.

Table 10.1 Integrals of some elementary functions

$f(x)$                              $\int f(x)\,dx$

$x^{\alpha}$, $\alpha \ne -1$       $\frac{x^{\alpha+1}}{\alpha+1} + c$
$\frac{1}{x}$                       $\log |x| + c$
$e^x$                               $e^x + c$
$a^x$                               $\frac{a^x}{\log a} + c$
$\sin x$                            $-\cos x + c$
$\cos x$                            $\sin x + c$
$\frac{1}{\sqrt{1 - x^2}}$          $\arcsin x + c$
$\frac{1}{1 + x^2}$                 $\arctan x + c$
$\sinh x$                           $\cosh x + c$
$\cosh x$                           $\sinh x + c$
$\frac{1}{\sqrt{1 + x^2}}$          $\operatorname{arsinh} x + c$
$\frac{1}{\sqrt{x^2 - 1}}$          $\operatorname{arcosh} x + c$

Experiment 10.8 Antiderivatives can be calculated in maple using the command int. Explanations and further integration commands can be found in the maple worksheet mp10_1.mws. Experiment with these maple commands by applying them to the examples of Table 10.1 and other functions of your choice.
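
Independently of maple, the entries of Table 10.1 can be checked numerically by differentiating the stated antiderivative with a central difference quotient and comparing with $f$. The sketch below is an illustration with hypothetical names; the sample points, the exponent $\alpha = 2.5$, the base $a = 2$ and the step size are arbitrary choices made here.

```python
import math

# (label, f, F, sample points inside the domain of validity)
TABLE = [
    ("x**2.5", lambda x: x**2.5, lambda x: x**3.5 / 3.5, [0.5, 1.0, 2.0]),
    ("1/x", lambda x: 1 / x, lambda x: math.log(abs(x)), [-2.0, 0.5, 3.0]),
    ("exp(x)", math.exp, math.exp, [-1.0, 0.0, 1.0]),
    ("2**x", lambda x: 2.0**x, lambda x: 2.0**x / math.log(2.0), [-1.0, 0.0, 1.0]),
    ("sin x", math.sin, lambda x: -math.cos(x), [-1.0, 0.3, 2.0]),
    ("cos x", math.cos, math.sin, [-1.0, 0.3, 2.0]),
    ("1/sqrt(1-x^2)", lambda x: 1 / math.sqrt(1 - x * x), math.asin, [-0.5, 0.0, 0.7]),
    ("1/(1+x^2)", lambda x: 1 / (1 + x * x), math.atan, [-2.0, 0.0, 1.5]),
    ("sinh x", math.sinh, math.cosh, [-1.0, 0.0, 1.0]),
    ("cosh x", math.cosh, math.sinh, [-1.0, 0.0, 1.0]),
    ("1/sqrt(1+x^2)", lambda x: 1 / math.sqrt(1 + x * x), math.asinh, [-2.0, 0.0, 1.5]),
    ("1/sqrt(x^2-1)", lambda x: 1 / math.sqrt(x * x - 1), math.acosh, [1.5, 2.0, 5.0]),
]


def max_defect(f, F, points, h=1e-5):
    """Largest deviation between the central difference quotient of F and f."""
    return max(abs((F(x + h) - F(x - h)) / (2 * h) - f(x)) for x in points)


if __name__ == "__main__":
    for label, f, F, points in TABLE:
        print(f"{label:15s} defect {max_defect(f, F, points):.2e}")
```

Such a check does not replace a proof, but it catches sign errors and wrong constants quickly.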

Experiment 10.9 Integrate the following expressions

$$x e^{-x^2}, \qquad e^{-x^2}, \qquad \sin(x^2)$$

with maple.

Functions  that  are  obtained  by  combining  power  functions,  exponential  functions 
and  trigonometric  functions,  as  well  as  their  inverses,  are  called  elementary  functions. 
The  derivative  of  an  elementary  function  is  again  an  elementary  function  and  can 
be  obtained  using  the  rules  from  Chap.  7.  In  contrast  to  differentiation  there  is  no 
general  procedure  for  computing  indefinite  integrals.  Not  only  does  the  calculation 
of  an  integral  often  turn  out  to  be  a  difficult  task,  but  there  are  also  many  elementary 
functions  whose  antiderivatives  are  not  elementary.  An  algorithm  to  decide  whether 
a function has an elementary indefinite integral was first deduced by Liouville
around  1835.  This  was  the  starting  point  for  the  field  of  symbolic  integration.  For 
details,  we  refer  to  [7]. 


Example 10.10 (Higher transcendental functions) Antiderivatives of functions that do not possess elementary integrals are frequently called higher transcendental functions. We give the following examples:

$\frac{2}{\sqrt{\pi}} \int e^{-x^2}\,dx = \mathrm{Erf}(x) + c$          Gaussian error function;
$\int \frac{e^x}{x}\,dx = \mathrm{Ei}(x) + c$          exponential integral;
$\int \frac{1}{\log x}\,dx = \mathrm{li}(x) + c$          logarithmic integral;
$\int \frac{\sin x}{x}\,dx = \mathrm{Si}(x) + c$          sine integral;
$\int \sin\left(\frac{\pi}{2}\,x^2\right) dx = S(x) + c$          Fresnel integral.


Proposition 10.11 (Rules for indefinite integration) For indefinite integration the following rules hold:

(a) Sum: $\int \bigl(f(x) + g(x)\bigr)\,dx = \int f(x)\,dx + \int g(x)\,dx$

(b) Constant factor: $\int \lambda f(x)\,dx = \lambda \int f(x)\,dx$ ($\lambda \in \mathbb{R}$)

(c) Integration by parts:

$$\int f(x) g'(x)\,dx = f(x) g(x) - \int f'(x) g(x)\,dx$$

(d) Substitution:

$$\int f(g(x))\,g'(x)\,dx = \int f(y)\,dy\,\Big|_{y = g(x)}$$

1 J. Liouville, 1809-1882.

2 A.J. Fresnel, 1788-1827.


Proof (a) and (b) are clear; (c) follows from the product rule for the derivative (Sect. 7.4)

$$\int f(x) g'(x)\,dx + \int f'(x) g(x)\,dx = \int \bigl(f(x) g'(x) + f'(x) g(x)\bigr)\,dx = \int \bigl(f(x) g(x)\bigr)'\,dx = f(x) g(x) + c,$$

which can be rewritten as

$$\int f(x) g'(x)\,dx = f(x) g(x) - \int f'(x) g(x)\,dx.$$

In this formula we can drop the integration constant $c$ since it is already contained in the notion of indefinite integrals, which appear on both sides. Point (d) is an immediate consequence of the chain rule according to which an antiderivative of $f(g(x))\,g'(x)$ is given by the antiderivative of $f(y)$ evaluated at $y = g(x)$. □


Example 10.12 The following five examples show how the rules of Table 10.1 and Proposition 10.11 can be applied.

(a) $$\int \frac{dx}{\sqrt[3]{x}} = \int x^{-1/3}\,dx = \frac{x^{-\frac{1}{3}+1}}{-\frac{1}{3}+1} + c = \frac{3}{2}\,x^{2/3} + c.$$

(b) $$\int x \cos x\,dx = x \sin x - \int \sin x\,dx = x \sin x + \cos x + c,$$

which follows via integration by parts:

$$f(x) = x, \qquad g'(x) = \cos x,$$
$$f'(x) = 1, \qquad g(x) = \sin x.$$

(c) $$\int \log x\,dx = x \log x - \int \frac{x}{x}\,dx = x \log x - x + c,$$

via integration by parts:

$$f(x) = \log x, \qquad g'(x) = 1,$$
$$f'(x) = \frac{1}{x}, \qquad g(x) = x.$$

(d) $$\int x \sin(x^2)\,dx = \int \frac{1}{2} \sin y\,dy\,\Big|_{y = x^2} = -\frac{1}{2} \cos y\,\Big|_{y = x^2} + c = -\frac{1}{2} \cos(x^2) + c,$$

which follows from the substitution rule with $y = g(x) = x^2$, $g'(x) = 2x$, $f(y) = \frac{1}{2} \sin y$.

(e) $$\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx = -\log |y|\,\Big|_{y = \cos x} + C = -\log |\cos x| + C,$$

again after substitution with $y = g(x) = \cos x$, $g'(x) = -\sin x$ and $f(y) = -1/y$.
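
Results obtained by integration by parts or substitution can always be checked by differentiation. As a short worked check, added here for illustration, the antiderivatives found in (c) and (d) give back the integrands:

```latex
\begin{align*}
\frac{d}{dx}\bigl(x \log x - x\bigr)
  &= \log x + x \cdot \frac{1}{x} - 1 = \log x, \\
\frac{d}{dx}\Bigl(-\tfrac{1}{2}\cos(x^2)\Bigr)
  &= \tfrac{1}{2}\sin(x^2) \cdot 2x = x \sin(x^2).
\end{align*}
```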


Example 10.13 (A simple expansion into partial fractions) In order to find the indefinite integral of $f(x) = 1/(x^2 - 1)$, we decompose the quadratic denominator into its linear factors $x^2 - 1 = (x - 1)(x + 1)$ and expand $f(x)$ into partial fractions of the form

$$\frac{1}{x^2 - 1} = \frac{A}{x - 1} + \frac{B}{x + 1}.$$

Resolving the fractions leads to the equation $1 = A(x + 1) + B(x - 1)$. Equating coefficients results in

$$(A + B)x = 0, \qquad A - B = 1$$

with the obvious solution $A = 1/2$, $B = -1/2$. Thus

$$\int \frac{dx}{x^2 - 1} = \frac{1}{2}\left(\int \frac{dx}{x - 1} - \int \frac{dx}{x + 1}\right) = \frac{1}{2}\bigl(\log |x - 1| - \log |x + 1|\bigr) + C = \frac{1}{2} \log \left|\frac{x - 1}{x + 1}\right| + C.$$

In view of Example 7.30, another antiderivative of $f(x) = 1/(x^2 - 1)$ is $F(x) = -\operatorname{artanh} x$. Thus, by Proposition 10.4,

$$\operatorname{artanh} x = -\frac{1}{2} \log \left|\frac{x - 1}{x + 1}\right| + C = \frac{1}{2} \log \left|\frac{x + 1}{x - 1}\right| + C.$$

Inserting $x = 0$ on both sides shows that $C = 0$ and yields an expression of the inverse hyperbolic tangent in terms of the logarithm.
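
The identity obtained at the end of Example 10.13 can be tested numerically for $-1 < x < 1$, where the absolute values can be dropped. The following sketch is an illustration with hypothetical function names; it relies only on `math.atanh` and `math.log` from the Python standard library.

```python
import math


def artanh_via_log(x):
    """Right-hand side of the identity, (1/2) log((1 + x)/(1 - x)), for -1 < x < 1."""
    return 0.5 * math.log((1 + x) / (1 - x))


def partial_fraction_antiderivative(x):
    """(1/2) log|(x - 1)/(x + 1)|, an antiderivative of 1/(x**2 - 1) for x != 1, -1."""
    return 0.5 * math.log(abs((x - 1) / (x + 1)))


def max_identity_error(points):
    """Largest deviation between math.atanh and the logarithmic expression."""
    return max(abs(math.atanh(x) - artanh_via_log(x)) for x in points)


if __name__ == "__main__":
    print(max_identity_error([-0.9, -0.5, 0.0, 0.3, 0.9]))
```

The second function can be checked in the same spirit by comparing a difference quotient with $1/(x^2 - 1)$, for instance at $x = 3$.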
10.3 Exercises

1. An object is thrown vertically upwards from the ground with a velocity of 10 [m/s]. Find its height $w(t)$ as a function of time $t$, the maximum height as well as the time of impact on the ground.

Hint. Integrate $w''(t) = -g \approx -9.81$ [m/s²] twice indefinitely and determine the integration constants from the initial conditions $w(0) = 0$, $w'(0) = 10$.

2. Compute the following indefinite integrals by hand and with maple:

(a) $\int (x + 3x^2 + 5x^4 + 7x^6)\,dx$,

(b) $\int \frac{dx}{\ldots}$,

(c) $\int x e^{-x^2}\,dx$ (substitution),

(d) $\int x e^{x}\,dx$ (integration by parts).

3. Compute the indefinite integrals

(a) $\int \cos^2 x\,dx$, (b) $\int \sqrt{1 - x^2}\,dx$

by hand and check the results using maple.

Hints. For (a) use the identity

$$\cos^2 x = \frac{1}{2}(1 + \cos 2x);$$

for (b) use the substitution $y = g(x) = \arcsin x$, $f(y) = 1 - \sin^2 y$.

4. Compute the indefinite integrals

(a) $\int \frac{dx}{x^2 + 2x + 5}$, (b) $\int \frac{dx}{x^2 + 2x - 3}$

by hand and check the results using maple.

Hints. Write the denominator in (a) in the form $(x + 1)^2 + 4$ and reduce it to $y^2 + 1$ by means of a suitable substitution. Factorize the denominator in (b) and follow the procedure of Example 10.13.

5. Compute the indefinite integrals

(a) $\int \frac{dx}{x^2 + 2x}$, (b) $\int \frac{dx}{x^2 + 2x + 1}$

by hand and check the results using maple.

6. Compute the indefinite integrals

(a) $\int x^2 \sin x\,dx$, (b) $\int x^2 e^{-3x}\,dx$.

Hint. Repeated integration by parts.

7. Compute the indefinite integrals

(a) $\int \frac{\ldots}{e^x + 1}$, (b) $\int \sqrt{1 + x^2}\,dx$.

Hint. Substitution $y = e^x$ in case (a), substitution $x = \sinh y$ in case (b), invoking the formula $\cosh^2 y - \sinh^2 y = 1$ and repeated integration by parts or recourse to the definition of the hyperbolic functions.

8. Show that the functions

$$f(x) = \arctan x \qquad \text{and} \qquad g(x) = \arctan \frac{1 + x}{1 - x}$$

differ in the interval $(-\infty, 1)$ by a constant. Compute this constant. Answer the same question for the interval $(1, \infty)$.

9. Prove the identity $\operatorname{arsinh} x = \log\bigl(x + \sqrt{1 + x^2}\bigr)$.

Hint. Recall from Chap. 7 that the functions $f(x) = \operatorname{arsinh} x$ and $g(x) = \log\bigl(x + \sqrt{1 + x^2}\bigr)$ have the same derivative. (Compare with the algebraic derivation of the formula in Exercise 15 of Sect. 2.3.)

Definite Integrals


In the introduction to Chap. 10 the notion of the definite integral of a function $f$ on an interval $[a, b]$ was already mentioned. It arises from summing up expressions of the form $f(x)\Delta x$ and taking limits. Such sums appear in many applications including the calculation of areas, surface areas and volumes as well as the calculation of lengths of curves. This chapter employs the notion of Riemann integrals as the basic concept of definite integration. Riemann's approach provides an intuitive concept in many applications, as will be elaborated in examples at the end of the chapter.

The  main  part  of  this  chapter  is  dedicated  to  the  properties  of  the  integral.  In 
particular,  the  two  fundamental  theorems  of  calculus  are  proven.  The  first  theorem 
allows  one  to  calculate  a  definite  integral  from  the  knowledge  of  an  antiderivative. 
The second fundamental theorem states that the definite integral of a function $f$
on an interval $[a, x]$ with variable upper bound provides an antiderivative of $f$.
Since  the  definite  integral  can  be  approximated,  for  example  by  Riemann  sums,  the 
second  fundamental  theorem  offers  a  possibility  to  approximate  the  antiderivative 
numerically.  This  is  of  importance,  for  example,  for  the  calculation  of  distribution 
functions  in  statistics. 


11.1  The  Riemann  Integral 

Example 11.1 (From velocity to distance) How can one calculate the distance $w$ which a vehicle travels between time $a$ and time $b$ if one only knows its velocity $v(t)$ for all times $a \le t \le b$? If $v(t) = v$ is constant, one simply gets

$$w = v \cdot (b - a).$$

Fig. 11.1 Subdivision of the time axis

If the velocity $v(t)$ is time-dependent, one divides the time axis into smaller subintervals (Fig. 11.1): $a = t_0 < t_1 < t_2 < \cdots < t_n = b$.

Choosing intermediate points $\tau_j \in [t_{j-1}, t_j]$ one obtains approximately

$$v(t) \approx v(\tau_j) \quad \text{for } t \in [t_{j-1}, t_j],$$

if $v$ is a continuous function of time. The approximation is the more precise, the shorter the intervals $[t_{j-1}, t_j]$ are chosen. The distance travelled in this interval is approximately equal to

$$w_j \approx v(\tau_j)(t_j - t_{j-1}).$$

The total distance covered between time $a$ and time $b$ is then

$$w = \sum_{j=1}^{n} w_j \approx \sum_{j=1}^{n} v(\tau_j)(t_j - t_{j-1}).$$

Letting the length of the subintervals $[t_{j-1}, t_j]$ tend to zero, one expects to obtain the actual value of the distance in the limit.

Example 11.2 (Area under the graph of a nonnegative function) In a similar way one can try to approximate the area under the graph of a function $y = f(x)$ by using rectangles which are successively refined (Fig. 11.2).

The sum of the areas of the rectangles

$$F \approx \sum_{j=1}^{n} f(x_j)(x_j - x_{j-1})$$

forms an approximation to the actual area under the graph.


Fig. 11.2 Sums of rectangles as approximation to the area

The two examples are based on the same concept, the Riemann integral¹, which we will now introduce. Let an interval $[a, b]$ and a function $f\colon [a, b] \to \mathbb{R}$ be given. Choosing points

$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b,$$

the intervals $[x_0, x_1], [x_1, x_2], \ldots, [x_{n-1}, x_n]$ form a partition $Z$ of the interval $[a, b]$. We denote the length of the largest subinterval by $\Phi(Z)$, i.e.

$$\Phi(Z) = \max_{j = 1, \ldots, n} |x_j - x_{j-1}|.$$

For arbitrarily chosen intermediate points $\xi_j \in [x_{j-1}, x_j]$ one calls the expression

$$S = \sum_{j=1}^{n} f(\xi_j)(x_j - x_{j-1})$$

a Riemann sum. In order to further specify the idea of the limiting process above, we take a sequence $Z_1, Z_2, Z_3, \ldots$ of partitions such that $\Phi(Z_N) \to 0$ as $N \to \infty$ and corresponding Riemann sums $S_N$.

Definition 11.3 A function $f$ is called Riemann integrable in $[a, b]$ if, for arbitrary sequences of partitions $(Z_N)_{N \ge 1}$ with $\Phi(Z_N) \to 0$, the corresponding Riemann sums $(S_N)_{N \ge 1}$ tend to the same limit $I(f)$, independently of the choice of the intermediate points. This limit

$$I(f) = \int_a^b f(x)\,dx$$

is called the definite integral of $f$ on $[a, b]$.

The intuitive approach in the introductory Examples 11.1 and 11.2 can now be made precise. If the respective functions $f$ and $v$ are Riemann integrable, then the integral

$$F = \int_a^b f(x)\,dx$$

represents the area between the $x$-axis and the graph, and

$$w = \int_a^b v(t)\,dt$$

gives the total distance covered.


1 B. Riemann, 1826-1866.

Experiment 11.4 Open the M-file mat11_1.m, study the given explanations and experiment with randomly chosen Riemann sums for the function $f(x) = 3x^2$ in the interval $[0, 1]$. What happens if you take more and more partition points $n$?
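
A randomly chosen Riemann sum in the sense of Experiment 11.4 can be sketched in Python as follows. The sketch is not the M-file mat11_1.m; the function name `random_riemann_sum` and the way the partition is drawn (uniformly distributed random interior points) are assumptions made here.

```python
import random


def random_riemann_sum(f, a, b, n, rng):
    """Riemann sum for a random partition of [a, b] into n subintervals.

    The n - 1 interior partition points and the intermediate points are
    drawn uniformly at random (an assumption of this sketch).
    """
    points = sorted([a, b] + [rng.uniform(a, b) for _ in range(n - 1)])
    total = 0.0
    for left, right in zip(points, points[1:]):
        xi = rng.uniform(left, right)  # intermediate point in [left, right]
        total += f(xi) * (right - left)
    return total


def cubic_rate(x):
    """The integrand f(x) = 3 x**2 of Experiment 11.4."""
    return 3.0 * x * x


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 1000, 10000):
        print(n, random_riemann_sum(cubic_rate, 0.0, 1.0, n, rng))
```

With growing $n$ the printed sums settle near 1, the value of $x^3$ at the right endpoint minus its value at the left endpoint.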

Experiment  11.5  Open  the  applet  Riemann  sums  and  study  the  effects  of  changing 
the  partition.  In  particular,  vary  the  maximum  length  of  the  subintervals  and  the 
choice  of  intermediate  points.  How  does  the  sign  of  the  function  influence  the  result? 

The  following  examples  illustrate  the  notion  of  Riemann  integrability. 


Example 11.6 (a) Let $f(x) = c$ = constant. Then the area under the graph of the function is the area of the rectangle $c(b - a)$. On the other hand, any Riemann sum is of the form

$$\begin{aligned}
f(\xi_1)(x_1 - x_0) + f(\xi_2)(x_2 - x_1) + \cdots + f(\xi_n)(x_n - x_{n-1})
&= c(x_1 - x_0 + x_2 - x_1 + \cdots + x_n - x_{n-1}) \\
&= c(x_n - x_0) = c(b - a).
\end{aligned}$$

All Riemann sums are equal and thus, as expected,

$$\int_a^b c\,dx = c(b - a).$$


(b) Let $f(x) = \frac{1}{x}$ for $x \in (0, 1]$, $f(0) = 0$. This function is not integrable in $[0, 1]$. The corresponding Riemann sums are of the form

$$\frac{1}{\xi_1}(x_1 - 0) + \frac{1}{\xi_2}(x_2 - x_1) + \cdots + \frac{1}{\xi_n}(x_n - x_{n-1}).$$

By choosing $\xi_1$ close to 0 every such Riemann sum can be made arbitrarily large; thus the limit of the Riemann sums does not exist.

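(c) Dirichlet's function²

$$f(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q} \end{cases}$$

is not integrable in $[0, 1]$. The Riemann sums are of the form

$$S_N = f(\xi_1)(x_1 - x_0) + \cdots + f(\xi_n)(x_n - x_{n-1}).$$

If all $\xi_j \in \mathbb{Q}$ then $S_N = 1$. If one takes all $\xi_j \notin \mathbb{Q}$ then $S_N = 0$; thus the limit depends on the choice of intermediate points $\xi_j$.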


² P.G.L. Dirichlet, 1805-1859.

Fig. 11.3 A piecewise continuous function


Remark 11.7 Riemann integrable functions $f : [a, b] \to \mathbb{R}$ are necessarily bounded. This fact can easily be shown by generalising the argument in Example 11.6(b).

The most important criteria for Riemann integrability are outlined in the following proposition. Its proof is simple; however, it requires a few technical considerations about refining partitions. For details, we refer to the literature, for instance [4, Chap. 5.1].

Proposition 11.8 (a) Every function which is bounded and monotonically increasing (monotonically decreasing) on an interval $[a, b]$ is Riemann integrable.

(b) Every piecewise continuous function on an interval $[a, b]$ is Riemann integrable. □


A  function  is  called  piecewise  continuous  if  it  is  continuous  except  for  a  finite 
number  of  points.  At  these  points,  the  graph  may  have  jumps  but  is  required  to  have 
left-  and  right-hand  limits  (Fig.  11.3). 


Remark 11.9 By taking equidistant grid points $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$ for the partition, i.e.

$$x_j - x_{j-1} =: \Delta x = \frac{b - a}{n},$$

the Riemann sums can be written as

$$S_N = \sum_{j=1}^{n} f(\xi_j)\,\Delta x.$$

The transition $\Delta x \to 0$ with simultaneous increase of the number of summands suggests the notation

$$\int_a^b f(x)\,dx.$$

Originally it was introduced by Leibniz³ with the interpretation as an infinite sum of infinitely small rectangles of width $dx$. After centuries of dispute, this interpretation can be rigorously justified today within the framework of nonstandard analysis (see, for instance, [27]).

³ G. Leibniz, 1646-1716.

Note that the integration variable $x$ in the definite integral is a bound variable and can be replaced by any other letter:

$$\int_a^b f(x)\,dx = \int_a^b f(t)\,dt = \int_a^b f(\xi)\,d\xi = \cdots$$

This can be used with advantage in order to avoid possible confusion with other bound variables.

Proposition 11.10 (Properties of the definite integral) In the following let $a < b$ and $f$, $g$ be Riemann integrable on $[a, b]$.

(a) Positivity:

$$f \ge 0 \text{ in } [a, b] \;\Rightarrow\; \int_a^b f(x)\,dx \ge 0, \qquad f \le 0 \text{ in } [a, b] \;\Rightarrow\; \int_a^b f(x)\,dx \le 0.$$

(b) Monotonicity:

$$f \le g \text{ in } [a, b] \;\Rightarrow\; \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$

In particular, with

$$m = \inf_{x \in [a, b]} f(x), \qquad M = \sup_{x \in [a, b]} f(x),$$

the following inequality holds

$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$

(c) Sum and constant factor (linearity):

$$\int_a^b \big(f(x) + g(x)\big)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx,$$

$$\int_a^b \lambda f(x)\,dx = \lambda \int_a^b f(x)\,dx \qquad (\lambda \in \mathbb{R}).$$

(d) Partition of the integration domain: Let $a < b < c$ and $f$ be integrable in $[a, c]$, then

$$\int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx.$$

If one defines

$$\int_a^a f(x)\,dx = 0, \qquad \int_b^a f(x)\,dx = -\int_a^b f(x)\,dx,$$

then one obtains the validity of the sum formula even for arbitrary $a, b, c \in \mathbb{R}$ if $f$ is integrable on the respective intervals.


Proof All justifications are easily obtained by considering the corresponding Riemann sums. □


Item (a) from Proposition 11.10 shows that the interpretation of the integral as the area under the graph is only appropriate if $f \ge 0$. On the other hand, the interpretation of the integral of a velocity as travelled distance is also meaningful for negative velocities (change of direction). Item (d) is especially important for the integration of piecewise continuous functions (see Fig. 11.3): the integral is obtained as the sum of the single integrals.


11.2 Fundamental Theorems of Calculus


For a Riemann integrable function $f$ we define a new function

$$F(x) = \int_a^x f(t)\,dt.$$

It is obtained by considering the upper boundary of the integration domain as variable.

Remark 11.11 For positive $f$, the value $F(x)$ is the area under the graph of the function in the interval $[a, x]$; see Fig. 11.4.

Fig. 11.4 The interpretation of $F(x)$ as area

Proposition 11.12 (Fundamental theorems of calculus) Let $f$ be continuous in $[a, b]$. Then the following assertions hold:

(a) First fundamental theorem: If $G$ is an antiderivative of $f$ then

$$\int_a^b f(x)\,dx = G(b) - G(a).$$

(b) Second fundamental theorem: The function

$$F(x) = \int_a^x f(t)\,dt$$

is an antiderivative of $f$, that is, $F$ is differentiable and $F'(x) = f(x)$.


Proof In the first step we prove the second fundamental theorem. For that let $x \in (a, b)$, $h > 0$ and $x + h \in (a, b)$. According to Proposition 6.15 the function $f$ has a minimum and a maximum in the interval $[x, x + h]$:

$$m(h) = \min_{t \in [x, x+h]} f(t), \qquad M(h) = \max_{t \in [x, x+h]} f(t).$$

The continuity of $f$ implies the convergence $m(h) \to f(x)$ and $M(h) \to f(x)$ as $h \to 0$. According to item (b) in Proposition 11.10 we have that

$$m(h) \cdot h \le F(x + h) - F(x) = \int_x^{x+h} f(t)\,dt \le M(h) \cdot h.$$

Dividing by $h$ and letting $h \to 0$ shows that $F$ is differentiable at $x$ and

$$F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = f(x).$$

The first fundamental theorem follows from the second fundamental theorem

$$\int_a^b f(t)\,dt = F(b) = F(b) - F(a),$$

since $F(a) = 0$. If $G$ is another antiderivative then $G = F + c$ according to Proposition 10.1; hence

$$G(b) - G(a) = F(b) + c - (F(a) + c) = F(b) - F(a).$$

Thus $G(b) - G(a) = \int_a^b f(x)\,dx$ as well. □

Remark 11.13 For positive $f$, the second fundamental theorem of calculus has an intuitive interpretation. The value $F(x + h) - F(x)$ is the area under the graph of the function $y = f(x)$ in the interval $[x, x + h]$, while $h f(x)$ is the area of the approximating rectangle of height $f(x)$. The resulting approximation

$$\frac{F(x + h) - F(x)}{h} \approx f(x)$$

suggests that in the limit as $h \to 0$, $F'(x) = f(x)$. The given proof makes the argument rigorous.
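The statement $F'(x) = f(x)$ can also be observed numerically. The following Python sketch approximates $F$ by a Riemann sum with midpoints and compares the difference quotient of $F$ with $f$; the choice $f = \cos$, $a = 0$, the evaluation point $x = 0.7$ and the number of subintervals are assumptions made for illustration.

```python
import math

def F(x, f=math.cos, a=0.0, n=2000):
    """Approximation of F(x) = int_a^x f(t) dt by a Riemann sum
    on an equidistant grid with midpoints as intermediate points."""
    dx = (x - a) / n
    return sum(f(a + (j + 0.5) * dx) for j in range(n)) * dx

x = 0.7
for h in (0.1, 0.01, 0.001):
    quotient = (F(x + h) - F(x)) / h           # difference quotient of F
    print(h, quotient, math.cos(x))
```

For decreasing $h$ the difference quotient approaches $f(x) = \cos x$, in accordance with the second fundamental theorem.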


Applications of the first fundamental theorem. The most important application consists in evaluating definite integrals $\int_a^b f(x)\,dx$. For that, one determines an antiderivative $F(x)$, for instance as indefinite integral, and substitutes:

$$\int_a^b f(x)\,dx = F(x)\Big|_{x=a}^{x=b} = F(b) - F(a).$$


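Example 11.14 As an application we compute the following integrals.

(a) $\displaystyle \int_1^3 x^2\,dx = \frac{x^3}{3}\Big|_{x=1}^{x=3} = \frac{27}{3} - \frac{1}{3} = \frac{26}{3}.$

(b) $\displaystyle \int_0^{\pi/2} \cos x\,dx = \sin x\Big|_{x=0}^{x=\pi/2} = \sin\frac{\pi}{2} - \sin 0 = 1.$

(c) $\displaystyle \int_0^1 x \sin(x^2)\,dx = -\frac{1}{2}\cos(x^2)\Big|_{x=0}^{x=1} = -\frac{1}{2}\cos 1 - \Big(-\frac{1}{2}\cos 0\Big) = -\frac{1}{2}\cos 1 + \frac{1}{2}$ (see Example 10.12).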
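The three values can be cross-checked with Riemann sums. The following Python sketch uses an equidistant grid with midpoints as intermediate points; the helper name `midpoint_rule` and the number of subintervals are assumptions made for illustration.

```python
import math

def midpoint_rule(f, a, b, n=10000):
    """Riemann sum on an equidistant grid with midpoints as intermediate points."""
    dx = (b - a) / n
    return sum(f(a + (j + 0.5) * dx) for j in range(n)) * dx

# (label, integrand, lower bound, upper bound, value from the antiderivative)
checks = [
    ("(a)", lambda x: x**2, 1.0, 3.0, 26 / 3),
    ("(b)", math.cos, 0.0, math.pi / 2, 1.0),
    ("(c)", lambda x: x * math.sin(x**2), 0.0, 1.0, 0.5 - 0.5 * math.cos(1.0)),
]
for label, f, a, b, exact in checks:
    print(label, midpoint_rule(f, a, b), exact)
```

In each line the Riemann sum and the value obtained from the antiderivative should agree to several digits.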


Remark 11.15 In maple the integration of expressions and functions is carried out using the command int, which requires the analytic expression and the domain as arguments, for instance

int(x^2, x = 1..3);


Applications of the second fundamental theorem. Usually, such applications are of theoretical nature, like the description of the relation between travelled distance and velocity,

$$w(t) = w(0) + \int_0^t v(s)\,ds, \qquad w'(t) = v(t),$$

where $w(t)$ denotes the travelled distance from 0 to time $t$ and $v(t)$ is the instantaneous velocity. Other applications arise in numerical analysis, for instance

$$\int_0^x e^{-y^2}\,dy \quad \text{is an antiderivative of } e^{-x^2}.$$

The value of such an integral can be approximately calculated using Taylor polynomials (see Application 12.18) or numerical integration methods (see Sect. 13.1). This is of particular interest if the antiderivative is not an elementary function, as is the case for the Gaussian error function from Example 10.10.


11.3 Applications of the Definite Integral

We now turn to further applications of the definite integral, which confirm the modelling power of the notion of the Riemann integral.


The volume of a solid of revolution. Assume first that for a three-dimensional solid (possibly after choosing an appropriate Cartesian coordinate system) the cross-sectional area $A = A(x)$ is known for every $x \in [a, b]$; see Fig. 11.5. The volume of a thin slice of thickness $\Delta x$ is approximately equal to $A(x)\Delta x$. Writing down the Riemann sums and taking limits one obtains for the volume $V$ of the solid

$$V = \int_a^b A(x)\,dx.$$

A solid of revolution is obtained by rotating the plane curve $y = f(x)$, $a \le x \le b$ around the x-axis. In this case, we have $A(x) = \pi f(x)^2$, and the volume is given by

$$V = \pi \int_a^b f(x)^2\,dx.$$

Fig. 11.5 Solid of revolution, volume

Fig. 11.6 A cone

Example 11.16 (Volume of a cone) The rotation of the straight line $y = \frac{r}{h}x$ around the x-axis produces a cone of radius $r$ and height $h$ (Fig. 11.6). Its volume is given by

$$V = \pi \int_0^h \frac{r^2}{h^2}x^2\,dx = \pi \frac{r^2}{h^2} \cdot \frac{x^3}{3}\Big|_{x=0}^{x=h} = \pi r^2 \frac{h}{3}.$$
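The cone formula can be checked against a Riemann sum for $\pi \int_a^b f(x)^2\,dx$. In the following Python sketch the function name `volume_of_revolution` and the sample values $r = 2$, $h = 5$ are assumptions made for illustration.

```python
import math

def volume_of_revolution(f, a, b, n=100000):
    """Approximates pi * int_a^b f(x)^2 dx by a Riemann sum with midpoints."""
    dx = (b - a) / n
    return math.pi * sum(f(a + (j + 0.5) * dx) ** 2 for j in range(n)) * dx

r, h = 2.0, 5.0                                # sample radius and height
numerical = volume_of_revolution(lambda x: r / h * x, 0.0, h)
exact = math.pi * r**2 * h / 3                 # formula from Example 11.16
print(numerical, exact)
```

The same function can be applied to any other profile $y = f(x)$, for instance to the curves in the exercises of this chapter.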


Arc length of the graph of a function. To determine the arc length of the graph of a differentiable function with continuous derivative, we first partition the interval $[a, b]$,

$$a = x_0 < x_1 < x_2 < \cdots < x_n = b,$$

and replace the graph $y = f(x)$ on $[a, b]$ by line segments passing through the points $(x_0, f(x_0)), (x_1, f(x_1)), \ldots, (x_n, f(x_n))$. The total length of the line segments is

$$s_n = \sum_{j=1}^{n} \sqrt{(x_j - x_{j-1})^2 + \big(f(x_j) - f(x_{j-1})\big)^2}.$$

It is simply given by the sum of the lengths of the individual segments (Fig. 11.7). According to the mean value theorem (Proposition 8.4) we have

$$\begin{aligned}
s_n &= \sum_{j=1}^{n} \sqrt{(x_j - x_{j-1})^2 + f'(\xi_j)^2 (x_j - x_{j-1})^2} \\
&= \sum_{j=1}^{n} \sqrt{1 + f'(\xi_j)^2}\,(x_j - x_{j-1})
\end{aligned}$$

Fig. 11.7 The arc length of a graph

with certain points $\xi_j \in [x_{j-1}, x_j]$. The sums $s_n$ are easily identified as Riemann sums. Their limit is thus given by

$$s = \int_a^b \sqrt{1 + f'(x)^2}\,dx.$$
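The convergence of the polygonal lengths $s_n$ towards the integral can be observed numerically. The following Python sketch uses equidistant grid points and the test function $f(x) = \cosh x$ on $[0, 1]$, for which $\sqrt{1 + f'(x)^2} = \cosh x$ and hence $s = \sinh 1$; the function name `polygon_length` and the choice of test function are assumptions made for illustration.

```python
import math

def polygon_length(f, a, b, n):
    """Total length s_n of the line segments through the points (x_j, f(x_j))
    on an equidistant grid with n subintervals."""
    xs = [a + j * (b - a) / n for j in range(n + 1)]
    return sum(math.hypot(x1 - x0, f(x1) - f(x0))
               for x0, x1 in zip(xs[:-1], xs[1:]))

exact = math.sinh(1.0)                         # arc length of cosh on [0, 1]
for n in (1, 10, 100, 1000):
    print(n, polygon_length(math.cosh, 0.0, 1.0, n), exact)
```

Since every line segment is at most as long as the arc it replaces, the values $s_n$ approach the arc length from below.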

Lateral surface area of a solid of revolution. The lateral surface of a solid of revolution is obtained by rotating the curve $y = f(x)$, $a \le x \le b$ around the x-axis.

In order to determine its area, we split the solid into small slices of thickness $\Delta x$. Each of these slices is approximately a truncated cone with generator of length $\Delta s$ and mean radius $f(x)$; see Fig. 11.8. According to Exercise 11 of Chap. 3 the lateral surface area of this truncated cone is equal to $2\pi f(x)\Delta s$. According to what has been said previously, $\Delta s \approx \sqrt{1 + f'(x)^2}\,\Delta x$ and thus the lateral surface area of a small slice is approximately equal to

$$2\pi f(x)\sqrt{1 + f'(x)^2}\,\Delta x.$$

Writing down the Riemann sums and taking limits one obtains

$$M = 2\pi \int_a^b f(x)\sqrt{1 + f'(x)^2}\,dx$$

for the lateral surface area.

Example 11.17 (Surface area of a sphere) The surface of a sphere of radius $r$ is generated by rotation of the graph $f(x) = \sqrt{r^2 - x^2}$, $-r \le x \le r$. Since $f'(x) = -x/\sqrt{r^2 - x^2}$ and thus $1 + f'(x)^2 = r^2/(r^2 - x^2)$, one obtains

$$M = 2\pi \int_{-r}^{r} \sqrt{r^2 - x^2}\,\frac{r}{\sqrt{r^2 - x^2}}\,dx = 2\pi r \int_{-r}^{r} dx = 4\pi r^2.$$

Fig. 11.8 Solid of rotation, curved surface area

11.4 Exercises

1. Modify the MATLAB program mat11_1.m so that it evaluates Riemann sums of given lengths $n$ for polynomials of degree $k$ on arbitrary intervals $[a, b]$ (MATLAB command polyval).

2.  Prove  that  every  function  which  is  piecewise  constant  in  an  interval  [a,  b]  is 
Riemann  integrable  (use  Definition  11.3). 

3. Compute the area between the graphs of $y = \sin x$ and $y = x$ on the interval $[0, 2\pi]$.

4. (From engineering mechanics; Fig. 11.9) The shear force $Q(x)$ and the bending moment $M(x)$ of a beam of length $L$ under a distributed load $p(x)$ obey the relationships $M'(x) = Q(x)$, $Q'(x) = -p(x)$, $0 \le x \le L$. Compute $Q(x)$ and $M(x)$ and sketch their graphs for

(a) a simply supported beam with uniformly distributed load: $p(x) = p_0$, $Q(0) = p_0 L/2$, $M(0) = 0$;

(b) a cantilever beam with triangular load: $p(x) = q_0(1 - x/L)$, $Q(L) = 0$, $M(L) = 0$.

5. Write a MATLAB program which provides a numerical approximation to the integral

$$\int_0^1 e^{-x^2}\,dx.$$

For this purpose, use Riemann sums of the form

$$L = \sum_{j=1}^{n} e^{-x_j^2}\,\Delta x, \qquad U = \sum_{j=1}^{n} e^{-x_{j-1}^2}\,\Delta x$$

with $x_j = j\Delta x$, $\Delta x = 1/n$ and try to determine $\Delta x$ and $n$, respectively, so that $U - L \le 0.01$; i.e. the result should be correct up to two digits. Compare your result with the value obtained by means of the MATLAB command sqrt(pi)/2*erf(1).

Additional task: Extend your program such that it allows one to compute $\int_0^a e^{-x^2}\,dx$ for arbitrary $a > 0$.


Fig. 11.9 Simply supported beam with uniformly distributed load, cantilever beam with triangular load

6. Show that the error of approximating the integral in Exercise 5 either by $L$ or $U$ is at most $U - L$. Use the applet Integration to visualise this fact.

Hint. Verify the inequality

$$L \le \int_0^1 e^{-x^2}\,dx \le U.$$

Thus, $L$ and $U$ are lower and upper sums, respectively.

7. Rotation of the parabola $y = 2\sqrt{x}$, $0 \le x \le 1$ around the x-axis produces a paraboloid. Sketch it and compute its volume and its lateral surface area.

8. Compute the arc length of the graph of the following functions:

(a) the parabola $f(x) = x^2/2$ for $0 \le x \le 2$;

(b) the catenary $g(x) = \cosh x$ for $-1 \le x \le 3$.

Hint. See Exercise 7 in Sect. 10.3.

9. The surface of a cooling tower can be described qualitatively by rotating the hyperbola $y = \sqrt{1 + x^2}$ around the x-axis in the bounds $-1 \le x \le 2$.

(a) Compute the volume of the corresponding solid of revolution.

(b) Show that the lateral surface area is given by $M = 2\pi \int_{-1}^{2} \sqrt{1 + 2x^2}\,dx$. Evaluate the integral directly and by means of maple.

Hint. Reduce the integral to the one considered in Exercise 7 of Sect. 10.3 by a suitable substitution.

10. A lens-shaped body is obtained by rotating the graph of the sine function $y = \sin x$ around the x-axis in the bounds $0 \le x \le \pi$.

(a) Compute the volume of the body.

(b) Compute its lateral surface area.

Hint. For (a) use the identity $\sin^2 x = \frac{1}{2}(1 - \cos 2x)$; for (b) use the substitution $g(x) = \cos x$.

11. (From probability theory) Let $X$ be a random variable with values in an interval $[a, b]$ which possesses a probability density $f(x)$, that is, $f(x) \ge 0$ and $\int_a^b f(x)\,dx = 1$. Its expectation value $\mu = E(X)$, its second moment $E(X^2)$ and its variance $V(X)$ are defined by

$$E(X) = \int_a^b x f(x)\,dx, \qquad E(X^2) = \int_a^b x^2 f(x)\,dx, \qquad V(X) = \int_a^b (x - \mu)^2 f(x)\,dx.$$

Show that $V(X) = E(X^2) - \mu^2$.

12. Compute the expectation value and the variance of a random variable which has

(a) a uniform distribution on $[a, b]$, i.e. $f(x) = 1/(b - a)$ for $a \le x \le b$;

(b) a (special) beta distribution on $[a, b]$ with density $f(x) = 6(x - a)(b - x)/(b - a)^3$.

13. Compute the expectation value and the variance of a random variable which has a triangular distribution on $[a, b]$ with modal value $m$, i.e.

$$f(x) = \begin{cases} \dfrac{2(x - a)}{(b - a)(m - a)} & \text{for } a \le x \le m, \\[2ex] \dfrac{2(b - x)}{(b - a)(b - m)} & \text{for } m \le x \le b. \end{cases}$$




Taylor  Series 


Approximations  of  complicated  functions  by  simpler  functions  play  a  vital  part  in 
applied  mathematics.  Starting  with  the  concept  of  linear  approximation  we  discuss 
the  approximation  of  a  function  by  Taylor  polynomials  and  by  Taylor  series  in  this 
chapter.  As  important  applications  we  will  use  Taylor  series  to  compute  limits  of 
functions  and  to  analyse  various  approximation  formulas. 


12.1 Taylor's Formula

In this section we consider the approximation of sufficiently smooth functions by polynomials as well as applications of these approximations. We have already seen an approximation formula in Chap. 7: Let $f$ be a function that is differentiable at $a$. Then

$$f(x) \approx g(x) = f(a) + f'(a) \cdot (x - a),$$

for all $x$ close to $a$. The linear approximation $g$ is a polynomial of degree 1 in $x$, and its graph is just the tangent to $f$ at $a$. We now want to generalise this approximation result.

Proposition 12.1 (Taylor's formula) Let $I \subset \mathbb{R}$ be an open interval and $f : I \to \mathbb{R}$ an $(n + 1)$-times continuously differentiable function (i.e., the derivative of order $(n + 1)$ of $f$ exists and is continuous). Then, for all $x, a \in I$,

$$f(x) = f(a) + f'(a) \cdot (x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + R_{n+1}(x, a)$$

with the remainder term (in integral form)

$$R_{n+1}(x, a) = \frac{1}{n!}\int_a^x (x - t)^n f^{(n+1)}(t)\,dt.$$

Alternatively the remainder term can be expressed by

$$R_{n+1}(x, a) = \frac{f^{(n+1)}(\xi)}{(n + 1)!}(x - a)^{n+1},$$

where $\xi$ is a point between $a$ and $x$ (Lagrange's² form of the remainder term).


Proof According to the fundamental theorem of calculus, we have

$$\int_a^x f'(t)\,dt = f(x) - f(a),$$

and thus

$$f(x) = f(a) + \int_a^x f'(t)\,dt.$$

We apply integration by parts to this formula. Due to

$$\int_a^x u'(t)v(t)\,dt = u(t)v(t)\Big|_a^x - \int_a^x u(t)v'(t)\,dt$$

with $u(t) = t - x$ and $v(t) = f'(t)$ we get

$$\begin{aligned}
f(x) &= f(a) + (t - x)f'(t)\Big|_a^x - \int_a^x (t - x)f''(t)\,dt \\
&= f(a) + f'(a) \cdot (x - a) + \int_a^x (x - t)f''(t)\,dt.
\end{aligned}$$

A further integration by parts yields

$$\begin{aligned}
\int_a^x (x - t)f''(t)\,dt &= -\frac{(x - t)^2}{2}f''(t)\Big|_a^x + \int_a^x \frac{(x - t)^2}{2}f'''(t)\,dt \\
&= \frac{f''(a)}{2}(x - a)^2 + \frac{1}{2}\int_a^x (x - t)^2 f'''(t)\,dt,
\end{aligned}$$

and one recognises that repeated integration by parts leads to the desired formula (with the remainder term in integral form). The other representation of the remainder term follows from the mean value theorem for integrals [4, Chap. 5, Theorem 5.4]. □

² J.L. Lagrange, 1736-1813.


Example 12.2 (Important special case) If one sets $x = a + h$ and replaces $a$ by $x$ in Taylor's formula, then one obtains

$$f(x + h) = f(x) + h f'(x) + \frac{h^2}{2}f''(x) + \cdots + \frac{h^n}{n!}f^{(n)}(x) + \frac{h^{n+1}}{(n + 1)!}f^{(n+1)}(\xi)$$

with a point $\xi$ between $x$ and $x + h$. For small $h$ this formula describes how the function $f$ behaves near $x$.


Remark 12.3 Often one does not know the remainder term

$$R_{n+1}(x, a) = \frac{f^{(n+1)}(\xi)}{(n + 1)!}(x - a)^{n+1}$$

explicitly since $\xi$ is unknown in general. Let $M$ be the supremum of $|f^{(n+1)}|$ in the considered interval around $a$. For $x$ in this interval we obtain the bound

$$|R_{n+1}(x, a)| \le \frac{M}{(n + 1)!}|x - a|^{n+1}.$$

The remainder term is thus bounded by a constant times $|h|^{n+1}$, where $h = x - a$. In this situation, one writes for short

$$R_{n+1}(a + h, a) = \mathcal{O}(h^{n+1})$$

as $h \to 0$ and calls the remainder a term of order $n + 1$. This notation is also used by maple.


Definition 12.4 The polynomial

$$T_n(x, a) = f(a) + f'(a) \cdot (x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n$$

is called the $n$th Taylor polynomial of $f$ around the point of expansion $a$.

The graphs of the functions $y = T_n(x, a)$ and $y = f(x)$ both pass through the point $(a, f(a))$. Their tangents in this point have the same slope $T_n'(a, a) = f'(a)$ and the graphs have the same curvature (due to $T_n''(a, a) = f''(a)$, see Chap. 14). It depends on the size of the remainder term how well the Taylor polynomial approximates the function.

Example 12.5 (Taylor polynomial of the exponential function) Let $f(x) = e^x$ and $a = 0$. Due to $(e^x)' = e^x$ we have $f^{(k)}(0) = e^0 = 1$ for all $k \ge 0$ and hence

$$e^x = 1 + x + \frac{x^2}{2} + \cdots + \frac{x^n}{n!} + \frac{e^\xi}{(n + 1)!}x^{n+1},$$

where $\xi$ denotes a point between 0 and $x$. We want to determine the minimal degree of the Taylor polynomial which approximates the function in the interval $[0, 1]$, correct to 5 digits. For that we require the following bound on the remainder term

$$\Big|e^x - 1 - x - \frac{x^2}{2} - \cdots - \frac{x^n}{n!}\Big| = \frac{e^\xi}{(n + 1)!}x^{n+1} \le 10^{-5}.$$

Note that $x \in [0, 1]$ as well as $e^\xi$ are non-negative. The above remainder will be maximal for $x = \xi = 1$. Thus we determine $n$ from the inequality $e/(n + 1)! \le 10^{-5}$. Due to $e \approx 3$ this inequality is certainly fulfilled from $n = 8$ onwards; in particular,

$$e = 1 + 1 + \frac{1}{2} + \cdots + \frac{1}{8!} \pm 10^{-5}.$$

One has to choose $n \ge 8$ in order to determine the first 5 digits of $e$.
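The search for the minimal degree is easily automated. The following Python sketch increases $n$ until the bound $e/(n + 1)! \le 10^{-5}$ holds and then compares the Taylor polynomial at $x = 1$ with the value of $e$; the function name `exp_taylor` is an assumption made for illustration, and the sketch does not reproduce the maple worksheet of the book.

```python
import math

def exp_taylor(x, n):
    """n-th Taylor polynomial of the exponential function around a = 0."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

# Smallest n for which the remainder bound e/(n+1)! is at most 10^-5
n = 0
while math.e / math.factorial(n + 1) > 1e-5:
    n += 1

error = abs(math.e - exp_taylor(1.0, n))
print(n, error)
```

The loop stops at $n = 8$, and the actual error at $x = 1$ lies below the required tolerance.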


Experiment 12.6 Repeat the above calculations with the help of the maple worksheet mp12_1.mws. In this worksheet the required maple commands for Taylor's formula are explained.


Example 12.7 (Taylor polynomial of the sine function) Let $f(x) = \sin x$ and $a = 0$. Recall that $(\sin x)' = \cos x$ and $(\cos x)' = -\sin x$ as well as $\sin 0 = 0$ and $\cos 0 = 1$. Therefore,

$$\begin{aligned}
\sin x &= \sum_{k=0}^{2n+1} \frac{\sin^{(k)}(0)}{k!}x^k + R_{2n+2}(x, 0) \\
&= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^n \frac{x^{2n+1}}{(2n + 1)!} + R_{2n+2}(x, 0).
\end{aligned}$$

Note that the Taylor polynomial consists of odd powers of $x$ only. According to Taylor's formula, the remainder has the form

$$R_{2n+2}(x, 0) = \frac{\sin^{(2n+2)}(\xi)}{(2n + 2)!}x^{2n+2}.$$

Since all derivatives of the sine function are bounded by 1, we obtain

$$|R_{2n+2}(x, 0)| \le \frac{|x|^{2n+2}}{(2n + 2)!}.$$

For fixed $x$ the remainder term tends to zero as $n \to \infty$, since the expression $x^{2n+2}/(2n + 2)!$ is a summand of the exponential series, which converges for all $x \in \mathbb{R}$. The above estimate can be interpreted as follows: For every $x \in \mathbb{R}$ and $\varepsilon > 0$, there exists an integer $N \in \mathbb{N}$ such that the difference of the sine function and its $n$th Taylor polynomial is small; more precisely,

$$|\sin t - T_n(t, 0)| \le \varepsilon$$

for all $n \ge N$ and $t \in [-x, x]$.
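The remainder estimate can be compared with the actual error. The following Python sketch evaluates the Taylor polynomial of degree $2n + 1$ of the sine function and prints the error next to the bound $|x|^{2n+2}/(2n + 2)!$; the function name `sin_taylor` and the evaluation point $x = 3$ are assumptions made for illustration.

```python
import math

def sin_taylor(x, n):
    """Taylor polynomial of degree 2n+1 of sin around a = 0:
    sum over k = 0..n of (-1)^k x^(2k+1) / (2k+1)!"""
    return sum((-1)**k * x**(2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))

x = 3.0
for n in range(8):
    err = abs(math.sin(x) - sin_taylor(x, n))
    bound = abs(x)**(2 * n + 2) / math.factorial(2 * n + 2)
    print(n, err, bound)
```

In every line the error stays below the bound, and both tend to zero rapidly as $n$ increases.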


Experiment 12.8 Using the maple worksheet mp12_2.mws compute the Taylor polynomials of $\sin x$ around the point 0 and determine the accuracy of the approximation (by plotting the difference to $\sin x$). In order to achieve high accuracy for large $x$, the degree of the polynomials has to be chosen sufficiently high. Due to rounding errors, however, this procedure quickly reaches its limits (unless one increases the number of significant digits).


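Example 12.9 The 4th degree Taylor polynomial $T_4(x, 0)$ of the function

$$f(x) = \begin{cases} \dfrac{x}{e^x - 1}, & x \ne 0, \\[1ex] 1, & x = 0, \end{cases}$$

is given by

$$T_4(x, 0) = 1 - \frac{x}{2} + \frac{1}{12}x^2 - \frac{1}{720}x^4.$$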

Experiment 12.10 The maple worksheet mp12_3.mws shows that, for sufficiently large $n$, the Taylor polynomial of degree $n$ gives a good approximation to the function from Example 12.9 on closed subintervals of $(-2\pi, 2\pi)$. For $x \ge 2\pi$ (as well as for $x \le -2\pi$) the Taylor polynomial is, however, useless.


12.2 Taylor's Theorem

The last example gives rise to the question for which points the Taylor polynomial converges to the function as $n \to \infty$.

Definition 12.11 Let $I \subset \mathbb{R}$ be an open interval and let $f : I \to \mathbb{R}$ have arbitrarily many derivatives. Given $a \in I$, the series

$$T(x, a, f) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k$$

is called Taylor series of $f$ around the point $a$.

Proposition 12.12 (Taylor's theorem) Let $f : I \to \mathbb{R}$ be a function with arbitrarily many derivatives and let $T(x, a, f)$ be its Taylor series around the point $a$. Then the function and its Taylor series coincide at $x \in I$, i.e.,

$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k,$$

if and only if the remainder term

$$R_n(x, a) = \frac{f^{(n)}(\xi)}{n!}(x - a)^n$$

tends to 0 as $n \to \infty$.

Proof According to Taylor's formula (Proposition 12.1),

$$f(x) - T_n(x, a) = R_{n+1}(x, a)$$

and hence

$$f(x) = \lim_{n \to \infty} T_n(x, a) = T(x, a, f) \quad \Longleftrightarrow \quad \lim_{n \to \infty} R_n(x, a) = 0,$$

which was to be shown. □


Example 12.13 Let $f(x) = \sin x$ and $a = 0$. Due to $R_n(x, 0) = \frac{\sin^{(n)}(\xi)}{n!}x^n$ we have

$$|R_n(x, 0)| \le \frac{|x|^n}{n!} \to 0$$

for $x$ fixed and $n \to \infty$. Hence for all $x \in \mathbb{R}$

$$\sin x = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k + 1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} \mp \cdots$$


12.3 Applications of Taylor's Formula

To  complete  this  chapter  we  discuss  a  few  important  applications  of  Taylor’s  formula. 

Application 12.14 (Extremum test) Let the function $f : I \to \mathbb{R}$ be $n$-times continuously differentiable in the interval $I$ and assume that

$$f'(a) = f''(a) = \cdots = f^{(n-1)}(a) = 0 \quad \text{and} \quad f^{(n)}(a) \ne 0.$$

Then the following assertions hold:

(a) The function $f$ has an extremum at $a$ if and only if $n$ is even;

(b) if $n$ is even and $f^{(n)}(a) > 0$, then $a$ is a local minimum of $f$;
if $n$ is even and $f^{(n)}(a) < 0$, then $a$ is a local maximum of $f$.

Proof Due to Taylor's formula, we have

$$f(x) - f(a) = \frac{f^{(n)}(\xi)}{n!}(x - a)^n, \qquad x \in I.$$

If $x$ is close to $a$, $f^{(n)}(\xi)$ and $f^{(n)}(a)$ have the same sign (since $f^{(n)}$ is continuous). For $n$ odd the right-hand side changes its sign at $x = a$ because of the term $(x - a)^n$. Hence an extremum can only occur for $n$ even. If now $n$ is even and $f^{(n)}(a) > 0$ then $f(x) > f(a)$ for all $x$ close to $a$ with $x \ne a$. Thus $a$ is a local minimum. □

Example 12.15 The polynomial $f(x) = 6 + 4x + 6x^2 + 4x^3 + x^4$ has the derivatives

$$f'(-1) = f''(-1) = f'''(-1) = 0, \qquad f^{(4)}(-1) = 24$$

at the point $x = -1$. Hence $x = -1$ is a local minimum of $f$.

Application 12.16 (Computation of limits of functions) As an example, we investigate the function

$$g(x) = \frac{x^2 \log(1 + x)}{(1 - \cos x)\sin x}$$

in the neighbourhood of $x = 0$. For $x = 0$ we obtain the undefined expression $\frac{0}{0}$. In order to determine the limit when $x$ tends to 0, we expand all appearing functions in Taylor polynomials around the point $a = 0$. Exercise 1 yields that $\cos x = 1 - \frac{x^2}{2} + \mathcal{O}(x^4)$. Taylor's formula for $\log(1 + x)$ around the point $a = 0$ reads

$$\log(1 + x) = x + \mathcal{O}(x^2)$$

because of $\log 1 = 0$ and $\log(1 + x)'\big|_{x=0} = 1$. We thus obtain

$$g(x) = \frac{x^2\big(x + \mathcal{O}(x^2)\big)}{\big(1 - 1 + \frac{x^2}{2} + \mathcal{O}(x^4)\big)\big(x + \mathcal{O}(x^3)\big)} = \frac{x^3 + \mathcal{O}(x^4)}{\frac{x^3}{2} + \mathcal{O}(x^5)} = \frac{1 + \mathcal{O}(x)}{\frac{1}{2} + \mathcal{O}(x^2)}$$

and consequently $\lim_{x \to 0} g(x) = 2$.
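The value of the limit can be made plausible by evaluating $g$ at points close to 0. The following Python sketch does this in floating point arithmetic; the use of `math.log1p` for $\log(1 + x)$ and the chosen sample points are assumptions made for illustration.

```python
import math

def g(x):
    """g(x) = x^2 log(1+x) / ((1 - cos x) sin x) for x != 0."""
    return x**2 * math.log1p(x) / ((1.0 - math.cos(x)) * math.sin(x))

for x in (0.1, 0.01, 0.001):
    print(x, g(x))
```

The printed values approach 2 with an error proportional to $x$, in agreement with the expansion above. For much smaller $x$ the subtraction $1 - \cos x$ suffers from cancellation, so the numerical evaluation becomes unreliable there.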


Application 12.17 (Analysis of approximation formulas) When differentiating numerically in Chap. 7, we considered the symmetric difference quotient

$$f''(x) \approx \frac{f(x + h) - 2f(x) + f(x - h)}{h^2}$$

as an approximation to the second derivative $f''(x)$. We are now in the position to investigate the accuracy of this formula. From

$$f(x + h) = f(x) + h f'(x) + \frac{h^2}{2}f''(x) + \frac{h^3}{6}f'''(x) + \mathcal{O}(h^4),$$

$$f(x - h) = f(x) - h f'(x) + \frac{h^2}{2}f''(x) - \frac{h^3}{6}f'''(x) + \mathcal{O}(h^4)$$

we infer that

$$f(x + h) + f(x - h) = 2f(x) + h^2 f''(x) + \mathcal{O}(h^4)$$

and hence

$$\frac{f(x + h) - 2f(x) + f(x - h)}{h^2} = f''(x) + \mathcal{O}(h^2).$$

One calls this formula second-order accurate. If one reduces $h$ by the factor $\lambda$, then the error reduces by the factor $\lambda^2$, as long as rounding errors do not play a decisive role.
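The second-order accuracy can be observed in a small numerical test. The following Python sketch applies the difference quotient to $f = \sin$ at $x = 1$, where $f''(x) = -\sin x$ is known; the function name `second_difference` and the choice of test function and step sizes are assumptions made for illustration.

```python
import math

def second_difference(f, x, h):
    """Symmetric difference quotient for the second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

x = 1.0
exact = -math.sin(x)                           # second derivative of sin at x
errors = []
for h in (0.1, 0.05, 0.025):
    err = abs(second_difference(math.sin, x, h) - exact)
    errors.append(err)
    print(h, err)

# Halving h should reduce the error by a factor of about 4
print(errors[0] / errors[1], errors[1] / errors[2])
```

Both printed ratios are close to 4, i.e. to $\lambda^2$ with $\lambda = 2$. For very small $h$ the ratios deteriorate because rounding errors are amplified by the division by $h^2$.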


Application 12.18 (Integration of functions that do not possess elementary integrals) As already mentioned in Sect. 10.2 there are functions whose antiderivatives cannot be expressed as combinations of elementary functions. For example, the function $f(x) = e^{-x^2}$ does not have an elementary integral. In order to compute the definite integral

$$\int_0^1 e^{-x^2}\,dx,$$

we approximate $e^{-x^2}$ by the Taylor polynomial of degree 8

$$e^{-x^2} \approx 1 - x^2 + \frac{x^4}{2} - \frac{x^6}{6} + \frac{x^8}{24}$$

and approximate the integral sought after by

$$\int_0^1 \Bigl(1 - x^2 + \frac{x^4}{2} - \frac{x^6}{6} + \frac{x^8}{24}\Bigr)\,dx = \frac{5651}{7560}.$$

The error of this approximation is $6.63 \cdot 10^{-4}$. For more precise results one takes a Taylor polynomial of a higher degree.
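The calculation of Application 12.18 can be reproduced with exact rational arithmetic. The Python sketch below integrates the Taylor polynomial term by term; the function name `taylor_integral` is hypothetical, and the reference value is obtained from the error function of the standard library (an assumption about how one would check the result, not the method of the text).

```python
import math
from fractions import Fraction

def taylor_integral(n_terms):
    """Integrate sum_{k < n_terms} (-1)^k x^(2k) / k! exactly over [0, 1]."""
    total = Fraction(0)
    for k in range(n_terms):
        # Integral of x^(2k) over [0, 1] is 1 / (2k + 1).
        total += Fraction((-1) ** k, math.factorial(k) * (2 * k + 1))
    return total

approx = taylor_integral(5)  # terms up to x^8
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)  # integral of exp(-x^2) over [0, 1]
error = abs(float(approx) - reference)

print("approximation:", approx)
print("error:", error)
```

Increasing `n_terms` corresponds to taking a Taylor polynomial of higher degree.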


Experiment 12.19 Using the maple worksheet mp12_4.mws repeat the calculations from Application 12.18. Subsequently modify the program such that you can integrate $g(x) = \cos(x^2)$ with it.


12.4 Exercises

1. Compute the Taylor polynomials of degree 0, 1, 2, 3 and 4 of the function $g(x) = \cos x$ around the point of expansion $a = 0$. For which $x \in \mathbb{R}$ does the Taylor series of $\cos x$ converge?

2. Compute the Taylor polynomials of degree 1, 3 and 5 of the function $\sin x$ around the point of expansion $a = 9\pi$. Further, compute the Taylor polynomial of degree 39 with maple and plot the graph together with the graph of the function in the interval $[0, 18\pi]$. In order to be able to better distinguish the two graphs you should plot them in different colours.

3. Compute the Taylor polynomials of degree 1, 2 and 3 of the function $f(t) = \sqrt{1 + t}$ around the point of expansion $a = 0$. Further compute the Taylor polynomial of degree 10 with maple.

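4. Compute the following limits using Taylor series expansion:

$$\lim_{x \to 0} \frac{x \sin x - x^2}{2 \cos x - 2 + x^2}, \qquad \lim_{x \to 0} \frac{e^{-x^2} - 1}{\sin^2(3x)},$$

$$\lim_{x \to 0} \frac{e^{2x} - 1 - 2x}{\sin^2 x}, \qquad \lim_{x \to 0} \frac{x^2 (\log(1 - 2x))^2}{1 - \cos(x^2)}.$$

Verify your results with maple.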

5.  For  the  approximate  evaluation  of  the  integral 


d  t 


replace  the  integrand  by  its  Taylor  polynomial  of  degree  9  and  integrate  this 
polynomial.  Verify  your  result  with  maple. 

6. Prove the formula

$$e^{i\varphi} = \cos\varphi + i \sin\varphi$$

by substituting the value $i\varphi$ for $x$ into the series of the exponential function

$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$$

and separating real and imaginary parts.

7. Compute the Taylor series of the hyperbolic functions $f(x) = \sinh x$ and $g(x) = \cosh x$ around the point of expansion $a = 0$ and verify the convergence of the series.

Hint. Compute the Taylor polynomials of degree $n - 1$ and show that the remainder terms $R_n(x, 0)$ can be estimated by $(\cosh M)\, M^n / n!$ whenever $|x| \le M$.


8. Show that the Taylor series of $f(x) = \log(1 + x)$ around $a = 0$ is given by

$$\log(1 + x) = \sum_{k=1}^{\infty} (-1)^{k+1} \frac{x^k}{k} = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} \pm \cdots$$

for $|x| < 1$.

Hint. A formal calculation, namely an integration of the geometric series expansion

$$\frac{1}{1 + t} = \frac{1}{1 - (-t)} = \sum_{j=0}^{\infty} (-1)^j t^j$$

from $t = 0$ to $t = x$, suggests the result. For a rigorous proof of convergence, the remainder term has to be estimated. This can be done by integrating the remainder term in the geometric series

$$\frac{1}{1 + t} - \sum_{j=0}^{n-1} (-1)^j t^j = \frac{1}{1 + t} - \frac{1 - (-1)^n t^n}{1 + t} = \frac{(-1)^n t^n}{1 + t},$$

observing that $1 + t \ge \delta > 0$ for some positive constant $\delta$ as long as $|t| \le |x| < 1$.
Numerical Integration


The fundamental theorem of calculus suggests the following approach to the calculation of definite integrals: one determines an antiderivative $F$ of the integrand $f$ and computes from that the value of the integral

$$\int_a^b f(x)\,dx = F(b) - F(a).$$

In practice, however, it is difficult and often even impossible to find an antiderivative $F$ as a combination of elementary functions. Apart from that, antiderivatives can also be fairly complex, as the example $\int x^{100} \sin x\,dx$ shows. Finally, in concrete applications the integrand is often given numerically and not by an explicit formula.
In  all  these  cases  one  reverts  to  numerical  methods.  In  this  chapter  the  basic  concepts 
of  numerical  integration  (quadrature  formulas  and  their  order)  are  introduced  and 
explained.  By  means  of  instructive  examples  we  analyse  the  achievable  accuracy  for 
the  Gaussian  quadrature  formulas  and  the  required  computational  effort. 


13.1 Quadrature Formulas

For the numerical computation of $\int_a^b f(x)\,dx$ we first split the interval of integration $[a, b]$ into subintervals with grid points $a = x_0 < x_1 < x_2 < \ldots < x_{N-1} < x_N = b$, see Fig. 13.1. From the additivity of the integral (Proposition 11.10(d)) we get

$$\int_a^b f(x)\,dx = \sum_{j=0}^{N-1} \int_{x_j}^{x_{j+1}} f(x)\,dx.$$
Fig. 13.1 Partition of the interval of integration into subintervals


Hence it is sufficient to find an approximation formula for a (small) subinterval of length $h_j = x_{j+1} - x_j$. One example of such a formula is the trapezoidal rule through which the area under the graph of a function is approximated by the area of the corresponding trapezoid (Fig. 13.2)

$$\int_{x_j}^{x_{j+1}} f(x)\,dx \approx h_j \cdot \frac{1}{2}\bigl(f(x_j) + f(x_{j+1})\bigr).$$


For the derivation and analysis of such approximation formulas it is useful to carry out a transformation onto the interval $[0, 1]$. By setting $x = x_j + \tau h_j$ one obtains from $dx = h_j\,d\tau$ that

$$\int_{x_j}^{x_{j+1}} f(x)\,dx = \int_0^1 f(x_j + \tau h_j)\, h_j\,d\tau = h_j \int_0^1 g(\tau)\,d\tau$$

with $g(\tau) = f(x_j + \tau h_j)$. Thus it is sufficient to find approximation formulas for $\int_0^1 g(\tau)\,d\tau$. The trapezoidal rule in this case is

$$\int_0^1 g(\tau)\,d\tau \approx \frac{1}{2}\bigl(g(0) + g(1)\bigr).$$

Obviously, it is exact if $g(\tau)$ is a polynomial of degree 0 or 1.


Fig. 13.2 Trapezoidal rule
In order to obtain a more accurate formula, we demand that quadratic polynomials are integrated exactly as well. For the moment let

$$g(\tau) = \alpha + \beta\tau + \gamma\tau^2$$

be a general polynomial of degree 2. Due to $g(0) = \alpha$, $g(\tfrac{1}{2}) = \alpha + \tfrac{1}{2}\beta + \tfrac{1}{4}\gamma$ and $g(1) = \alpha + \beta + \gamma$ we get by a short calculation

$$\int_0^1 (\alpha + \beta\tau + \gamma\tau^2)\,d\tau = \alpha + \frac{1}{2}\beta + \frac{1}{3}\gamma = \frac{1}{6}\Bigl(g(0) + 4 g\bigl(\tfrac{1}{2}\bigr) + g(1)\Bigr).$$

The corresponding approximation formula for general $g$ reads

$$\int_0^1 g(\tau)\,d\tau \approx \frac{1}{6}\Bigl(g(0) + 4 g\bigl(\tfrac{1}{2}\bigr) + g(1)\Bigr).$$

By construction, it is exact for polynomials of degree less than or equal to 2; it is called Simpson's rule.

The  special  forms  of  the  trapezoidal  and  of  Simpson’s  rule  motivate  the  following 
definition. 


Definition 13.1 The approximation formula

$$\int_0^1 g(\tau)\,d\tau \approx \sum_{i=1}^{s} b_i\, g(c_i)$$

is called a quadrature formula. The numbers $b_1, \ldots, b_s$ are called weights, and the numbers $c_1, \ldots, c_s$ are called nodes of the quadrature formula; the integer $s$ is called the number of stages.

A quadrature formula is determined by the specification of the weights and nodes. Thus we denote a quadrature formula by $\{(b_i, c_i),\ i = 1, \ldots, s\}$ for short. Without loss of generality the weights $b_i$ are not zero, and the nodes are pairwise different ($c_i \ne c_k$ for $i \ne k$).


Example 13.2 (a) The trapezoidal rule has $s = 2$ stages and is given by

$$b_1 = b_2 = \frac{1}{2}, \quad c_1 = 0, \quad c_2 = 1.$$

(b) Simpson's rule has $s = 3$ stages and is given by

$$b_1 = \frac{1}{6}, \quad b_2 = \frac{2}{3}, \quad b_3 = \frac{1}{6}, \quad c_1 = 0, \quad c_2 = \frac{1}{2}, \quad c_3 = 1.$$

Footnote 1: T. Simpson, 1710-1761.


In order to compute the original integral $\int_a^b f(x)\,dx$ by quadrature formulas, one has to reverse the transformation from $f$ to $g$. Due to $g(\tau) = f(x_j + \tau h_j)$ one obtains

$$\int_{x_j}^{x_{j+1}} f(x)\,dx = h_j \int_0^1 g(\tau)\,d\tau \approx h_j \sum_{i=1}^{s} b_i\, g(c_i) = h_j \sum_{i=1}^{s} b_i\, f(x_j + c_i h_j),$$

and thus the approximation formula

$$\int_a^b f(x)\,dx = \sum_{j=0}^{N-1} \int_{x_j}^{x_{j+1}} f(x)\,dx \approx \sum_{j=0}^{N-1} h_j \sum_{i=1}^{s} b_i\, f(x_j + c_i h_j).$$
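The double sum above translates almost literally into code. The following Python sketch is a minimal illustration under our own assumptions: the names `composite_quadrature`, `TRAPEZOIDAL` and `SIMPSON`, the equidistant grid and the test integrand $\cos x$ on $[0, 3]$ are chosen for the example only.

```python
import math

def composite_quadrature(f, grid, weights, nodes):
    """Approximate the integral of f over [grid[0], grid[-1]].

    On each subinterval [x_j, x_{j+1}] of length h_j the formula
    h_j * sum_i b_i * f(x_j + c_i * h_j) is applied; the results are added.
    """
    total = 0.0
    for x_left, x_right in zip(grid[:-1], grid[1:]):
        h = x_right - x_left
        total += h * sum(b * f(x_left + c * h) for b, c in zip(weights, nodes))
    return total

# Weights and nodes on [0, 1] as listed in Example 13.2.
TRAPEZOIDAL = ([1 / 2, 1 / 2], [0.0, 1.0])
SIMPSON = ([1 / 6, 2 / 3, 1 / 6], [0.0, 0.5, 1.0])

# Illustrative test problem (assumption): integral of cos over [0, 3] with N = 16.
grid = [3.0 * j / 16 for j in range(17)]
exact = math.sin(3.0)
err_trapezoidal = abs(composite_quadrature(math.cos, grid, *TRAPEZOIDAL) - exact)
err_simpson = abs(composite_quadrature(math.cos, grid, *SIMPSON) - exact)
print(err_trapezoidal, err_simpson)
```

The grid need not be equidistant; any increasing list of grid points works, since each subinterval uses its own length $h_j$.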

We  now  look  for  quadrature  formulas  that  are  as  accurate  as  possible.  Since  the 
integrand  is  typically  well  approximated  by  Taylor  polynomials  on  small  intervals, 
a  good  quadrature  formula  is  characterised  by  the  property  that  it  integrates  exactly 
as  many  polynomials  as  possible.  This  idea  motivates  the  following  definition. 

Definition 13.3 (Order) The quadrature formula $\{(b_i, c_i),\ i = 1, \ldots, s\}$ has order $p$ if all polynomials $g$ of degree less than or equal to $p - 1$ are integrated exactly by the quadrature formula; i.e.,

$$\int_0^1 g(\tau)\,d\tau = \sum_{i=1}^{s} b_i\, g(c_i)$$

for all polynomials $g$ of degree smaller than or equal to $p - 1$.

Example  13.4  (a)  The  trapezoidal  rule  has  order  2. 

(b)  Simpson’s  rule  has  (by  construction)  at  least  order  3. 

The  following  proposition  yields  an  algebraic  characterisation  of  the  order  of 
quadrature  formulas. 


Proposition 13.5 A quadrature formula $\{(b_i, c_i),\ i = 1, \ldots, s\}$ has order $p$ if and only if

$$\sum_{i=1}^{s} b_i\, c_i^{q-1} = \frac{1}{q} \quad \text{for } 1 \le q \le p.$$
Proof One uses the fact that a polynomial $g$ of degree $p - 1$

$$g(\tau) = \alpha_0 + \alpha_1 \tau + \ldots + \alpha_{p-1} \tau^{p-1}$$

is a linear combination of monomials, and that both integration and application of a quadrature formula are linear processes. Thus it is sufficient to prove the result for the monomials

$$g(\tau) = \tau^{q-1}, \quad 1 \le q \le p.$$

The proposition now follows directly from the identity

$$\frac{1}{q} = \int_0^1 \tau^{q-1}\,d\tau = \sum_{i=1}^{s} b_i\, g(c_i) = \sum_{i=1}^{s} b_i\, c_i^{q-1}. \qquad \square$$

The conditions of the proposition

$$\begin{aligned}
b_1 + b_2 + \ldots + b_s &= 1 \\
b_1 c_1 + b_2 c_2 + \ldots + b_s c_s &= \tfrac{1}{2} \\
b_1 c_1^2 + b_2 c_2^2 + \ldots + b_s c_s^2 &= \tfrac{1}{3} \\
&\;\;\vdots \\
b_1 c_1^{p-1} + b_2 c_2^{p-1} + \ldots + b_s c_s^{p-1} &= \tfrac{1}{p}
\end{aligned}$$

are called order conditions of order $p$. If $s$ nodes $c_1, \ldots, c_s$ are given then the order conditions form a linear system of equations for the unknown weights $b_i$. If the nodes are pairwise different then the weights can be determined uniquely from that. This shows that for $s$ different nodes there always exists a unique quadrature formula of order $p \ge s$.
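The last remark is constructive: for given pairwise different nodes, the first $s$ order conditions form a linear system for the weights. The Python sketch below solves this system in exact rational arithmetic by Gauss-Jordan elimination. The function name `weights_from_nodes` and the choice of elimination method are our own assumptions for illustration, not a procedure taken from the text.

```python
from fractions import Fraction

def weights_from_nodes(nodes):
    """Solve sum_i b_i c_i^(q-1) = 1/q, q = 1..s, for the weights b_1..b_s."""
    s = len(nodes)
    # Augmented matrix: row q-1 holds c_1^(q-1), ..., c_s^(q-1) and the right-hand side 1/q.
    rows = [[Fraction(c) ** q for c in nodes] + [Fraction(1, q + 1)] for q in range(s)]
    # Gauss-Jordan elimination; a pivot exists because the nodes are pairwise different.
    for col in range(s):
        pivot = next(r for r in range(col, s) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(s):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [rows[i][s] for i in range(s)]

trapezoidal_weights = weights_from_nodes([0, 1])
simpson_weights = weights_from_nodes([0, Fraction(1, 2), 1])
print(trapezoidal_weights)
print(simpson_weights)
```

With the nodes of Example 13.2 the computed weights coincide with those of the trapezoidal rule and of Simpson's rule.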

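Example 13.6 We determine once more the order of Simpson's rule. Due to

$$\begin{aligned}
b_1 + b_2 + b_3 &= \tfrac{1}{6} + \tfrac{2}{3} + \tfrac{1}{6} = 1 \\
b_1 c_1 + b_2 c_2 + b_3 c_3 &= \tfrac{2}{3} \cdot \tfrac{1}{2} + \tfrac{1}{6} = \tfrac{1}{2} \\
b_1 c_1^2 + b_2 c_2^2 + b_3 c_3^2 &= \tfrac{2}{3} \cdot \tfrac{1}{4} + \tfrac{1}{6} = \tfrac{1}{3}
\end{aligned}$$

its order is at least 3 (as we already know from the construction). However, additionally

$$b_1 c_1^3 + b_2 c_2^3 + b_3 c_3^3 = \tfrac{2}{3} \cdot \tfrac{1}{8} + \tfrac{1}{6} = \tfrac{3}{12} = \tfrac{1}{4},$$

i.e., Simpson's rule even has order 4.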
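The check carried out in Example 13.6 can be automated by testing the order conditions one after another until the first one fails. The following Python sketch does this in exact arithmetic; the function name `order` and the cut-off `max_order` are hypothetical choices made for this illustration.

```python
from fractions import Fraction

def order(weights, nodes, max_order=20):
    """Largest p <= max_order with sum_i b_i c_i^(q-1) = 1/q for all q = 1..p."""
    p = 0
    for q in range(1, max_order + 1):
        lhs = sum(Fraction(b) * Fraction(c) ** (q - 1) for b, c in zip(weights, nodes))
        if lhs != Fraction(1, q):
            break
        p = q
    return p

half, sixth = Fraction(1, 2), Fraction(1, 6)
order_trapezoidal = order([half, half], [0, 1])
order_simpson = order([sixth, 4 * sixth, sixth], [0, half, 1])
print(order_trapezoidal, order_simpson)
```

Exact fractions are used on purpose: with floating point numbers the equality test would have to be replaced by a tolerance.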
The best quadrature formulas (high accuracy with little computational effort) are the Gaussian quadrature formulas. For that we state the following result whose proof can be found in [23, Chap. 10, Corollary 10.1].

Proposition 13.7 There is no quadrature formula with $s$ stages of order $p > 2s$. On the other hand, for every $s \in \mathbb{N}$ there exists a (unique) quadrature formula of order $p = 2s$. This formula is called $s$-stage Gaussian quadrature formula.


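The Gaussian quadrature formulas for $s \le 3$ are

$$\begin{aligned}
s = 1:&\quad c_1 = \tfrac{1}{2},\ b_1 = 1, && \text{order 2 (midpoint rule);} \\
s = 2:&\quad c_1 = \tfrac{1}{2} - \tfrac{\sqrt{3}}{6},\ c_2 = \tfrac{1}{2} + \tfrac{\sqrt{3}}{6},\ b_1 = b_2 = \tfrac{1}{2}, && \text{order 4;} \\
s = 3:&\quad c_1 = \tfrac{1}{2} - \tfrac{\sqrt{15}}{10},\ c_2 = \tfrac{1}{2},\ c_3 = \tfrac{1}{2} + \tfrac{\sqrt{15}}{10},\ b_1 = \tfrac{5}{18},\ b_2 = \tfrac{8}{18},\ b_3 = \tfrac{5}{18}, && \text{order 6.}
\end{aligned}$$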
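That these formulas have order $2s$ and not more can be checked numerically with the order conditions of Proposition 13.5. The Python sketch below works in floating point arithmetic, so the conditions are only satisfied up to rounding errors; the dictionary `GAUSS` and the helper `order_defect` are hypothetical names introduced for this illustration.

```python
import math

# Weights and nodes of the Gaussian formulas on [0, 1] for s = 1, 2, 3.
GAUSS = {
    1: ([1.0], [0.5]),
    2: ([0.5, 0.5], [0.5 - math.sqrt(3) / 6, 0.5 + math.sqrt(3) / 6]),
    3: ([5 / 18, 8 / 18, 5 / 18],
        [0.5 - math.sqrt(15) / 10, 0.5, 0.5 + math.sqrt(15) / 10]),
}

def order_defect(weights, nodes, q):
    """Residual of the q-th order condition: sum_i b_i c_i^(q-1) - 1/q."""
    return sum(b * c ** (q - 1) for b, c in zip(weights, nodes)) - 1.0 / q

for s, (b, c) in GAUSS.items():
    largest_defect = max(abs(order_defect(b, c, q)) for q in range(1, 2 * s + 1))
    next_defect = abs(order_defect(b, c, 2 * s + 1))
    print(f"s = {s}: conditions 1..{2 * s} hold up to {largest_defect:.1e}, "
          f"condition {2 * s + 1} fails by {next_defect:.1e}")
```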


13.2 Accuracy and Efficiency


In the following numerical experiment the accuracy of quadrature formulas will be illustrated. With the help of the Gaussian quadrature formulas of order 2, 4 and 6 we compute the two integrals

$$\int_0^3 \cos x\,dx = \sin 3$$


and 


For that we choose equidistant grid points

$$x_j = a + jh, \quad j = 0, \ldots, N$$

with $h = (b - a)/N$ and $N = 1, 2, 4, 8, 16, \ldots, 512$. Finally, we plot the costs of the calculation as a function of the achieved accuracy in a double-logarithmic diagram.

A measure for the computational cost of a quadrature formula is the number of required function evaluations, abbreviated by fe. For an $s$-stage quadrature formula, it is the number

$$\mathrm{fe} = s \cdot N.$$

The achieved accuracy err is the absolute value of the error. The corresponding results are presented in Fig. 13.3. One makes the following observations:


Fig. 13.3 Accuracy-cost-diagram of the Gaussian quadrature formulas. The crosses are the results of the one-stage Gaussian method of order 2, the squares the ones of the two-stage method of order 4 and the circles the ones of the three-stage method of order 6


(a) The curves are straight lines (as long as one does not get into the range of rounding errors, like with the three-stage method in the left picture).

(b) In the left picture the straight lines have slope $-1/p$, where $p$ is the order of the quadrature formula. In the right picture this is only true for the method of order 2, and the other two methods result in straight lines with slope $-2/7$.

(c)  For  given  costs  the  formulas  of  higher  order  are  more  accurate. 

In order to understand this behaviour, we expand the integrand into a Taylor series. On the subinterval $[a, a + h]$ of length $h$ we obtain

$$f(a + \tau h) = \sum_{q=0}^{p-1} \frac{h^q}{q!} f^{(q)}(a)\, \tau^q + O(h^p).$$

Since a quadrature formula of order $p$ integrates polynomials of degree less than or equal to $p - 1$ exactly, the Taylor polynomial of $f$ of degree $p - 1$ is being integrated exactly. The error of the quadrature formula on this subinterval is proportional to the length of the interval times the size of the remainder term of the integrand, so

$$h \cdot O(h^p) = O(h^{p+1}).$$

In total we have $N$ subintervals; hence, the total error of the quadrature formula is

$$N \cdot O(h^{p+1}) = Nh \cdot O(h^p) = (b - a) \cdot O(h^p) = O(h^p).$$

Thus we have shown that (for small $h$) the error err behaves like

$$\mathrm{err} \approx c_1 \cdot h^p.$$
Since furthermore

$$\mathrm{fe} = sN = s \cdot Nh \cdot h^{-1} = s \cdot (b - a) \cdot h^{-1} = c_2 \cdot h^{-1}$$

holds true, we obtain

$$\log(\mathrm{fe}) = \log c_2 - \log h \quad\text{and}\quad \log(\mathrm{err}) \approx \log c_1 + p \cdot \log h,$$

so altogether

$$\log(\mathrm{fe}) \approx c_3 - \frac{1}{p} \cdot \log(\mathrm{err}).$$

This explains why straight lines with slope $-1/p$ appear in the left picture.

In the right picture it has to be noted that the second derivative of the integrand is discontinuous at 0. Hence the above considerations with the Taylor series are not valid anymore. The quadrature formula also detects this discontinuity of the high derivatives and reacts with a so-called order reduction; i.e., the methods show a lower order (in our case $p = 7/2$).
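The relation $\mathrm{err} \approx c_1 h^p$ also suggests a simple way to estimate the order experimentally: halving $h$ divides the error by about $2^p$. The Python sketch below applies this to $\int_0^3 \cos x\,dx$ with the one-stage and the two-stage Gaussian formulas. It is not the MATLAB code used for Fig. 13.3; the function names and the choice $N = 8$ are our own assumptions.

```python
import math

MIDPOINT = ([1.0], [0.5])
GAUSS_2STAGE = ([0.5, 0.5], [0.5 - math.sqrt(3) / 6, 0.5 + math.sqrt(3) / 6])

def composite(f, a, b, n, weights, nodes):
    """Apply an s-stage quadrature formula on n equidistant subintervals of [a, b]."""
    h = (b - a) / n
    return h * sum(bi * f(a + (j + ci) * h)
                   for j in range(n) for bi, ci in zip(weights, nodes))

def observed_order(f, a, b, exact, weights, nodes, n=8):
    """Estimate p from err(h) / err(h/2), which is about 2^p for small h."""
    err_h = abs(composite(f, a, b, n, weights, nodes) - exact)
    err_half = abs(composite(f, a, b, 2 * n, weights, nodes) - exact)
    return math.log2(err_h / err_half)

p_midpoint = observed_order(math.cos, 0.0, 3.0, math.sin(3.0), *MIDPOINT)
p_gauss2 = observed_order(math.cos, 0.0, 3.0, math.sin(3.0), *GAUSS_2STAGE)
print(round(p_midpoint, 2), round(p_gauss2, 2))
```

The estimate is only meaningful as long as the errors are well above the level of rounding errors, in accordance with observation (a).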


Experiment  13.8  Compute  the  integrals 


dx 

x 


using the Gaussian quadrature formulas and generate an accuracy-cost-diagram. For that purpose modify the programs mat13_1.m, mat13_2.m, mat13_3.m, mat13_4.m and mat13_5.m with which Fig. 13.3 was produced.


Commercial programs for numerical integration determine the grid points adaptively based on automatic error estimates. The user can usually specify the desired accuracy. In MATLAB the routines quad.m and quadl.m serve this purpose.


13.3  Exercises 


1. For the calculation of $\int_0^1 x^{100} \sin x\,dx$ first determine an antiderivative $F$ of the integrand $f$ using maple. Then evaluate $F(1) - F(0)$ to 10, 50, 100, 200 and 400 digits and explain the surprising results.

2. Determine the order of the quadrature formula given by

$$b_1 = b_4 = \frac{1}{8}, \quad b_2 = b_3 = \frac{3}{8}, \quad c_1 = 0, \quad c_2 = \frac{1}{3}, \quad c_3 = \frac{2}{3}, \quad c_4 = 1.$$


3. Determine the unique quadrature formula of order 3 with the nodes


1  2 

c  1  =  C2  =  C3  =  1. 

4.  Determine  the  unique  quadrature  formula  with  the  nodes 


1 


Which  order  does  it  have? 

5. Familiarise yourself with the MATLAB programs quad.m and quadl.m for the computation of definite integrals and test the programs for


/' 


e  dx  and 


/  y/x  dx, 

Jo 


6. Justify the formulas

$$\pi = 4 \int_0^1 \frac{dx}{1 + x^2} \quad\text{and}\quad \pi = 4 \int_0^1 \sqrt{1 - x^2}\,dx$$

and use them to calculate $\pi$ by numerical integration. To do so divide the interval $[0, 1]$ into $N$ equally large parts ($N = 10, 100, \ldots$) and use Simpson's rule on those subintervals. Why are the results obtained with the first formula always more accurate?

7. Write a MATLAB program that allows you to evaluate the integral of any given (continuous) function on a given interval $[a, b]$, both by the trapezoidal rule and by Simpson's rule. Use your program to numerically answer the questions of Exercises 7-9 from Sect. 11.4 and Exercise 5 from Sect. 12.4.

8. Use your program from Exercise 7 to produce tables (for $x = 0$ to $x = 10$ in steps of 0.5) of some higher transcendental functions:

(a)  the  Gaussian  error  function 


(b)  the  sine  integral 


Erf(x)  = 


Si(x)  = 


sin  y 

y 


(c)  the  Fresnel  integral 
9. (Experimental determination of expectation values) The family of standard beta distributions on the interval $[0, 1]$ is defined through the probability densities

$$f(x; r, s) = \frac{1}{B(r, s)}\, x^{r-1} (1 - x)^{s-1}, \quad 0 \le x \le 1,$$

where $r, s > 0$. Here $B(r, s) = \int_0^1 y^{r-1} (1 - y)^{s-1}\,dy$ is the beta function, which is a higher transcendental function for non-integer values of $r, s$. For integer values of $r, s \ge 1$ it is given by

$$B(r, s) = \frac{(r - 1)!\,(s - 1)!}{(r + s - 1)!}.$$

With the help of the MATLAB program quad.m, compute the expectation values $\mu(r, s) = \int_0^1 x f(x; r, s)\,dx$ for various integer values of $r$ and $s$ and guess a general formula for $\mu(r, s)$ from your experimental results.
Curves


The  graph  of  a  function  y  =  f(x)  represents  a  curve  in  the  plane.  This  concept, 
however,  is  too  tight  to  represent  more  intricate  curves,  like  loops,  self-intersections, 
or  even  curves  of  fractal  dimension.  The  aim  of  this  chapter  is  to  introduce  the 
concept  of  parametrised  curves  and  to  study,  in  particular,  the  case  of  differentiable 
curves.  For  the  visualisation  of  the  trajectory  of  a  curve,  the  notions  of  velocity 
vector,  moving  frame,  and  curvature  are  important.  The  chapter  contains  a  collection 
of  geometrically  interesting  examples  of  curves  and  several  of  their  construction 
principles.  Further,  the  computation  of  the  arc  length  of  differentiable  curves  is 
discussed,  and  an  example  of  a  continuous,  bounded  curve  of  infinite  length  is  given. 
The  chapter  ends  with  a  short  outlook  on  spatial  curves.  For  the  vector  algebra  used 
in  this  chapter,  we  refer  to  Appendix  A. 


14.1 Parametrised Curves in the Plane

Definition 14.1 A parametrised plane curve is a continuous mapping

$$t \mapsto \mathbf{x}(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}$$

of an interval $[a, b]$ to $\mathbb{R}^2$; i.e., both the components $t \mapsto x(t)$ and $t \mapsto y(t)$ are continuous functions. The variable $t \in [a, b]$ is called parameter of the curve.

Concerning the vector notation we remark that $x(t)$, $y(t)$ actually represent the coordinates of a point in $\mathbb{R}^2$. It is, however, common practice and useful to write this point as a position vector, thus the column notation.
Example 14.2 An object that is thrown at height $h$ with horizontal velocity $v_H$ and vertical velocity $v_V$ has the trajectory

$$x(t) = v_H t, \qquad y(t) = h + v_V t - \frac{g}{2} t^2, \qquad 0 \le t \le t_0,$$

where $t_0$ is the positive solution of the equation $h + v_V t_0 - \frac{g}{2} t_0^2 = 0$ (time of impact, see Fig. 14.1). In this example, we can eliminate $t$ and represent the trajectory as the graph of a function (ballistic curve). We have $t = x/v_H$, and thus

$$y = h + \frac{v_V}{v_H}\, x - \frac{g}{2 v_H^2}\, x^2.$$
Example 14.3 A circle of radius $R$ with centre at the origin has the parametric representation

$$x(t) = R \cos t, \qquad y(t) = R \sin t, \qquad 0 \le t \le 2\pi.$$

In this case, $t$ can be interpreted as the angle between the position vector and the positive $x$-axis (Fig. 14.1). The components $x = x(t)$, $y = y(t)$ satisfy the quadratic equation

$$x^2 + y^2 = R^2;$$

however, one cannot represent the circle in its entirety as the graph of a function.

Experiment 14.4 Open the M-file mat14_1.m and discuss which curve is being represented. Compare with the M-files mat14_2.m to mat14_4.m. Are these the same curves?

Experiment  14.4  suggests  that  one  can  view  curves  statically  as  a  set  of  points  in 
the  plane  or  dynamically  as  the  trajectory  of  a  moving  point.  Both  perspectives  are 
of  importance  in  applications. 


Fig. 14.1 Parabolic trajectory and circle
The kinematic point of view. In the kinematic interpretation, one considers the parameter $t$ of the curve as time and the curve as path. Different parametrisations of the same geometric object are viewed as different curves.

The geometric point of view. In the geometric interpretation, the location, the moving sense and the number of cycles are considered as the defining properties of a curve. The particular parametrisation, however, is irrelevant.

A strictly monotonically increasing, continuous mapping of an interval $[\alpha, \beta]$ to $[a, b]$,

$$\varphi : [\alpha, \beta] \to [a, b]$$

is called a change of parameter. The curve

$$\tau \mapsto \boldsymbol{\xi}(\tau), \quad \alpha \le \tau \le \beta$$

is called a reparametrisation of the curve

$$t \mapsto \mathbf{x}(t), \quad a \le t \le b,$$

if it is obtained through a change of parameter $t = \varphi(\tau)$; i.e.,

$$\boldsymbol{\xi}(\tau) = \mathbf{x}(\varphi(\tau)).$$

From the geometric point of view, the parametrised curves $\tau \mapsto \boldsymbol{\xi}(\tau)$ and $t \mapsto \mathbf{x}(t)$ are identified. A plane curve $\Gamma$ is an equivalence class of parametrised curves which can be transformed to one another by reparametrisation.

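Example 14.5 We consider the segment of a parabola, parametrised by

$$\Gamma : \mathbf{x}(t) = \begin{bmatrix} t \\ t^2 \end{bmatrix}, \quad -1 \le t \le 1.$$

Reparametrisations are for instance

$$\varphi : [-\tfrac{1}{2}, \tfrac{1}{2}] \to [-1, 1], \quad \varphi(\tau) = 2\tau,$$

$$\tilde\varphi : [-1, 1] \to [-1, 1], \quad \tilde\varphi(\tau) = \tau^3.$$

Consequently,

$$\boldsymbol{\xi}(\tau) = \begin{bmatrix} 2\tau \\ 4\tau^2 \end{bmatrix}, \quad -\tfrac{1}{2} \le \tau \le \tfrac{1}{2}$$

and

$$\boldsymbol{\eta}(\tau) = \begin{bmatrix} \tau^3 \\ \tau^6 \end{bmatrix}, \quad -1 \le \tau \le 1$$

geometrically represent the same curve. However,

$$\psi : [-1, 1] \to [-1, 1], \quad \psi(\tau) = -\tau,$$

$$\tilde\psi : [0, 1] \to [-1, 1], \quad \tilde\psi(\tau) = -1 + 8\tau(1 - \tau)$$

are not reparametrisations and yield other curves, namely

$$\tau \mapsto \begin{bmatrix} -\tau \\ \tau^2 \end{bmatrix}, \quad -1 \le \tau \le 1,$$

$$\tau \mapsto \begin{bmatrix} -1 + 8\tau(1 - \tau) \\ (-1 + 8\tau(1 - \tau))^2 \end{bmatrix}, \quad 0 \le \tau \le 1.$$

In the first case, the moving sense of $\Gamma$ is reversed, and in the second case, the curve is traversed twice.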
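Whether a given map qualifies as a change of parameter can be tested on sample points. The Python sketch below checks, for the four maps of Example 14.5, that the sampled values are strictly increasing and that the endpoints are mapped onto the endpoints. This is only a necessary condition verified on a finite grid, not a proof; the function name `looks_like_change_of_parameter` and the number of samples are hypothetical choices.

```python
def looks_like_change_of_parameter(phi, alpha, beta, a, b, samples=1000):
    """Sample-based check: phi strictly increasing on [alpha, beta], phi(alpha)=a, phi(beta)=b."""
    taus = [alpha + k * (beta - alpha) / samples for k in range(samples + 1)]
    values = [phi(tau) for tau in taus]
    increasing = all(u < v for u, v in zip(values, values[1:]))
    endpoints_match = abs(values[0] - a) < 1e-12 and abs(values[-1] - b) < 1e-12
    return increasing and endpoints_match

checks = {
    "2*tau": looks_like_change_of_parameter(lambda tau: 2 * tau, -0.5, 0.5, -1.0, 1.0),
    "tau**3": looks_like_change_of_parameter(lambda tau: tau ** 3, -1.0, 1.0, -1.0, 1.0),
    "-tau": looks_like_change_of_parameter(lambda tau: -tau, -1.0, 1.0, -1.0, 1.0),
    "-1 + 8*tau*(1 - tau)": looks_like_change_of_parameter(
        lambda tau: -1 + 8 * tau * (1 - tau), 0.0, 1.0, -1.0, 1.0),
}
for name, result in checks.items():
    print(f"{name}: {result}")
```

The first two maps pass the check, the last two fail it, in agreement with the discussion in the example.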


Experiment  14.6  Modify  the  M-files  from  Experiment  14.4  so  that  the  curves  from 
Example  14.5  are  represented. 

Algebraic curves. These are obtained as the set of zeros of polynomials in two variables. We have already encountered the parabola and the circle as examples:

$$y - x^2 = 0, \qquad x^2 + y^2 - R^2 = 0.$$

One  can  also  create  cusps  and  loops  in  this  way. 

Example 14.7 Neil's parabola

$$y^2 - x^3 = 0$$

has a cusp at $x = y = 0$ (Fig. 14.2). Generally, one obtains algebraic curves from

$$y^2 - (x + p)\, x^2 = 0, \quad p \in \mathbb{R}.$$

For $p > 0$ they have a loop. A parametric representation of this curve is, for instance,

$$x(t) = t^2 - p, \qquad y(t) = t(t^2 - p), \qquad -\infty < t < \infty.$$

Footnote 2: W. Neil, 1637-1670.
Fig. 14.2 Neil's parabola, the $\alpha$-curve and an elliptic curve


In the following we will primarily deal with curves which are given by differentiable parametrisations.

Definition 14.8 If a plane curve $\Gamma : t \mapsto \mathbf{x}(t)$ has a parametrisation whose components $t \mapsto x(t)$, $t \mapsto y(t)$ are differentiable, then $\Gamma$ is called a differentiable curve. If the components are $k$-times differentiable, then $\Gamma$ is called a $k$-times differentiable curve.

The  graphical  representation  of  a  differentiable  curve  does  not  have  to  be  smooth 
but  may  have  cusps  and  corners,  as  Example  14.7  shows. 

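Example 14.9 (Straight line and half ray) The parametric representation

$$t \mapsto \mathbf{x}(t) = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} + t \begin{bmatrix} r_1 \\ r_2 \end{bmatrix}, \quad -\infty < t < \infty$$

describes a straight line through the point $\mathbf{x}_0 = [x_0, y_0]^{\mathsf T}$ in the direction $\mathbf{r} = [r_1, r_2]^{\mathsf T}$. If one restricts the parameter $t$ to $0 \le t < \infty$ one obtains a half ray. The parametrisation

$$\mathbf{x}_H(t) = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} + t^2 \begin{bmatrix} r_1 \\ r_2 \end{bmatrix}, \quad -\infty < t < \infty$$

leads to a double passage through the half ray.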

Example 14.10 (Parametric representation of an ellipse) The equation of an ellipse is

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$

A parametric representation (single passage in counterclockwise sense) is obtained by

$$x(t) = a \cos t, \qquad y(t) = b \sin t, \qquad 0 \le t \le 2\pi.$$

This can be seen by substituting these expressions into the equation of the ellipse. The meaning of the parameter $t$ can be seen from Fig. 14.3.

Fig. 14.3 Parametric representation of the ellipse


Example 14.11 (Parametric representation of a hyperbola) The hyperbolic sine and the hyperbolic cosine have been introduced in Sect. 2.2. The important identity

$$\cosh^2 t - \sinh^2 t = 1$$

has been noted there. It shows that

$$x(t) = a \cosh t, \qquad y(t) = b \sinh t, \qquad -\infty < t < \infty$$

is a parametric representation of the right branch of the hyperbola

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,$$

which is highlighted in Fig. 14.4.

Example 14.12 (Cycloids) A circle with radius $R$ rolls (without sliding) along the $x$-axis. If the starting position of the centre $M$ is initially $M = (0, R)$, its position will be $M_t = (Rt, R)$ after a turn of angle $t$. A point $P$ with starting position $P = (0, R - A)$ thus moves to $P_t = M_t - (A \sin t, A \cos t)$.




The trajectory of the point $P$ is called a cycloid. It is parametrised by

$$x(t) = Rt - A \sin t, \qquad y(t) = R - A \cos t, \qquad -\infty < t < \infty.$$

Compare Fig. 14.5 for the derivation and Fig. 14.6 for some possible shapes of cycloids.


Definition 14.13 Let $\Gamma : t \mapsto \mathbf{x}(t)$ be a differentiable curve. The rate of change of the position vector with regard to the parameter of the curve

$$\dot{\mathbf{x}}(t) = \lim_{h \to 0} \frac{1}{h}\bigl(\mathbf{x}(t + h) - \mathbf{x}(t)\bigr) = \begin{bmatrix} \dot x(t) \\ \dot y(t) \end{bmatrix}$$

is called the velocity vector at the point $\mathbf{x}(t)$ of the curve. If $\dot{\mathbf{x}}(t) \ne \mathbf{0}$ one defines the tangent vector

$$\mathbf{T}(t) = \frac{1}{\sqrt{\dot x(t)^2 + \dot y(t)^2}} \begin{bmatrix} \dot x(t) \\ \dot y(t) \end{bmatrix}$$

and the normal vector

$$\mathbf{N}(t) = \frac{1}{\sqrt{\dot x(t)^2 + \dot y(t)^2}} \begin{bmatrix} -\dot y(t) \\ \dot x(t) \end{bmatrix}$$

of the curve. The pair $(\mathbf{T}(t), \mathbf{N}(t))$ is called moving frame. If the curve $\Gamma$ is twice differentiable then the acceleration vector is given by

$$\ddot{\mathbf{x}}(t) = \begin{bmatrix} \ddot x(t) \\ \ddot y(t) \end{bmatrix}.$$

Fig. 14.4 Parametric representation of the right branch of a hyperbola

Fig. 14.5 Parametrisation of a cycloid

Fig. 14.6 Cycloids for $A = R/2,\ R,\ 3R/2$

Fig. 14.7 Velocity vector, acceleration vector, tangent vector, normal vector

In the kinematic interpretation the parameter $t$ is the time and $\dot{\mathbf{x}}(t)$ the velocity vector in the physical sense. If it is different from zero, it points in the direction of the tangent (as limit of secant vectors). The tangent vector is just the unit vector of the same direction. By rotation of 90° in the counterclockwise sense we obtain the normal vector of the curve, see Fig. 14.7.


Experiment  14.14  Open  the  Java  applet  Parametric  curves  in  the  plane.  Plot  the 
curves  from  Example  14.5  and  the  corresponding  velocity  and  acceleration  vectors. 
Use  the  moving  frame  to  visualise  the  kinematic  curve  progression. 


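Example 14.15 For the parabola from Example 14.2 we get

$$\dot x(t) = v_H, \quad \ddot x(t) = 0, \qquad \dot y(t) = v_V - gt, \quad \ddot y(t) = -g,$$

$$\mathbf{T}(t) = \frac{1}{\sqrt{v_H^2 + (v_V - gt)^2}} \begin{bmatrix} v_H \\ v_V - gt \end{bmatrix}, \qquad \mathbf{N}(t) = \frac{1}{\sqrt{v_H^2 + (v_V - gt)^2}} \begin{bmatrix} gt - v_V \\ v_H \end{bmatrix}.$$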
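The moving frame is easy to evaluate once the velocity vector is available. The following Python sketch computes $\mathbf{T}(t)$ and $\mathbf{N}(t)$ from a given velocity function and applies it to the trajectory of Example 14.2. The numerical values $v_H = 3$, $v_V = 4$, $g = 9.81$ and the name `moving_frame` are assumptions made for this illustration only.

```python
import math

def moving_frame(velocity, t):
    """Return (T, N) for a curve with velocity vector velocity(t) = (x'(t), y'(t))."""
    vx, vy = velocity(t)
    speed = math.hypot(vx, vy)
    if speed == 0.0:
        raise ValueError("velocity vector vanishes; the tangent vector is undefined")
    tangent = (vx / speed, vy / speed)
    normal = (-vy / speed, vx / speed)  # tangent rotated by 90 degrees counterclockwise
    return tangent, normal

# Illustrative values (assumptions): v_H = 3, v_V = 4, g = 9.81.
v_h, v_v, g = 3.0, 4.0, 9.81

def throw_velocity(t):
    """Velocity vector of the trajectory from Example 14.2."""
    return (v_h, v_v - g * t)

tangent0, normal0 = moving_frame(throw_velocity, 0.0)
print(tangent0, normal0)
```

At the apex of the trajectory, $t = v_V / g$, the vertical velocity vanishes, so the tangent vector is horizontal and the normal vector points straight up.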
14.2 Arc Length and Curvature

We start with the question whether and how a length can be assigned to a curve segment. Let a continuous curve

$$\Gamma : t \mapsto \mathbf{x}(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \quad a \le t \le b$$

be given. For a partition $Z$: $a = t_0 < t_1 < \cdots < t_n = b$ of the parameter interval we consider the (inscribed) polygonal chain through the points

$$\mathbf{x}(t_0), \mathbf{x}(t_1), \ldots, \mathbf{x}(t_n).$$

The length of the largest subinterval is again denoted by $\Phi(Z)$. The length of the polygonal chain is

$$L_n = \sum_{i=1}^{n} \sqrt{(x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2}.$$

Definition 14.16 (Curves of finite length) A plane curve $\Gamma$ is called rectifiable or of finite length if the lengths $L_n$ of all inscribed polygonal chains $Z_n$ converge towards one (and the same) limit provided that $\Phi(Z_n) \to 0$.
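The definition can be illustrated numerically by computing the lengths of inscribed polygonal chains for finer and finer partitions. The Python sketch below does this for a circle of radius $R = 2$ with equidistant partitions of $[0, 2\pi]$; the radius, the partitions and the function name `chain_length` are our own choices for this example. The lengths should increase towards the circumference $2\pi R$.

```python
import math

def chain_length(curve, a, b, n):
    """Length of the inscribed polygonal chain for n equal parameter steps on [a, b]."""
    points = [curve(a + i * (b - a) / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

R = 2.0  # illustrative radius (assumption)

def circle(t):
    """Parametrisation of the circle from Example 14.3."""
    return (R * math.cos(t), R * math.sin(t))

lengths = [chain_length(circle, 0.0, 2 * math.pi, n) for n in (4, 16, 64, 256)]
circumference = 2 * math.pi * R
for n, length in zip((4, 16, 64, 256), lengths):
    print(f"n = {n:3d}   L_n = {length:.6f}   difference = {circumference - length:.2e}")
```

Equidistant partitions are only one admissible family; the definition requires the same limit for every sequence of partitions with $\Phi(Z_n) \to 0$.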

Example 14.17 (Koch's snowflake) Koch's snowflake was introduced in Sect. 9.1 as an example of a finite region whose boundary has the fractal dimension $d = \log 4 / \log 3$ and infinite length. This was proven by the fact that the boundary can be constructed as the limit of polygonal chains whose lengths tend to infinity. It remains to verify that the boundary of Koch's snowflake is indeed a continuous, parametrised curve. This can be seen as follows. The snowflake of depth 0 is an equilateral triangle, for instance with the vertices $\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3 \in \mathbb{R}^2$. Using the unit interval $[0, 1]$ we obtain a continuous parametrisation

$$\mathbf{x}_0(t) = \begin{cases} \mathbf{p}_1 + 3t\,(\mathbf{p}_2 - \mathbf{p}_1), & 0 \le t \le \tfrac{1}{3}, \\ \mathbf{p}_2 + (3t - 1)(\mathbf{p}_3 - \mathbf{p}_2), & \tfrac{1}{3} \le t \le \tfrac{2}{3}, \\ \mathbf{p}_3 + (3t - 2)(\mathbf{p}_1 - \mathbf{p}_3), & \tfrac{2}{3} \le t \le 1. \end{cases}$$

We parametrise the snowflake of depth 1 by splitting the three intervals $[0, \tfrac{1}{3}]$, $[\tfrac{1}{3}, \tfrac{2}{3}]$, $[\tfrac{2}{3}, 1]$ into three parts each and using the middle parts for the parametrisation of the inserted next smaller angle (Fig. 14.8). Continuing in this way we obtain a sequence of parametrisations

$$t \mapsto \mathbf{x}_0(t), \quad t \mapsto \mathbf{x}_1(t), \quad \ldots, \quad t \mapsto \mathbf{x}_n(t), \quad \ldots$$
This is a sequence of continuous functions $[0, 1] \to \mathbb{R}^2$ which, due to its construction, converges uniformly (see Definition C.5). According to Proposition C.6 the limit function

\[
\mathbf{x}(t) = \lim_{n \to \infty} \mathbf{x}_n(t), \quad t \in [0, 1]
\]

is continuous (and obviously parametrises the boundary of Koch’s snowflake).

Fig. 14.8 Parametrisation of the boundary of Koch’s snowflake

This  example  shows  that  continuous  curves  can  be  infinitely  long  even  if  the 
parameter  of  the  curve  only  varies  in  a  bounded  interval  [a ,  b] .  That  such  a  behaviour 
does  not  appear  for  differentiable  curves  is  shown  by  the  next  proposition. 

Proposition 14.18 (Length of differentiable curves) Every continuously differentiable curve $t \mapsto \mathbf{x}(t)$, $t \in [a, b]$ is rectifiable. Its length is

\[
L = \int_a^b \|\dot{\mathbf{x}}(t)\| \, dt = \int_a^b \sqrt{\dot{x}(t)^2 + \dot{y}(t)^2} \, dt.
\]

Proof We only give the proof for the somewhat simpler case that the components of the velocity vector $\dot{\mathbf{x}}(t)$ are Lipschitz continuous (see Appendix C.4), for instance with a Lipschitz constant $C$. We start with a partition $Z$: $a = t_0 < t_1 < \cdots < t_n = b$ of $[a, b]$ with corresponding $\Phi(Z)$. The integral defining $L$ is the limit of Riemann sums

\[
\int_a^b \sqrt{\dot{x}(t)^2 + \dot{y}(t)^2} \, dt = \lim_{n \to \infty,\, \Phi(Z) \to 0} \sum_{i=1}^{n} \sqrt{\dot{x}(\tau_i)^2 + \dot{y}(\tau_i)^2} \, (t_i - t_{i-1}),
\]

where $\tau_i \in [t_{i-1}, t_i]$. On the other hand, according to the mean value theorem, Proposition 8.4, the length of the inscribed polygonal chain through $\mathbf{x}(t_0), \mathbf{x}(t_1), \ldots, \mathbf{x}(t_n)$ is equal to

\[
\sum_{i=1}^{n} \sqrt{(x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2}
= \sum_{i=1}^{n} \sqrt{\dot{x}(\rho_i)^2 + \dot{y}(\sigma_i)^2} \, (t_i - t_{i-1})
\]
for certain $\rho_i, \sigma_i \in [t_{i-1}, t_i]$. In order to be able to estimate the difference between the Riemann sums and the lengths of the inscribed polygonal chains, we use the inequality (triangle inequality for vectors in the plane)

\[
\left| \sqrt{a^2 + b^2} - \sqrt{c^2 + d^2} \right| \le \sqrt{(a - c)^2 + (b - d)^2},
\]

which can be checked directly by squaring. Applying this inequality shows that

\[
\begin{aligned}
\left| \sqrt{\dot{x}(\tau_i)^2 + \dot{y}(\tau_i)^2} - \sqrt{\dot{x}(\rho_i)^2 + \dot{y}(\sigma_i)^2} \right|
&\le \sqrt{(\dot{x}(\tau_i) - \dot{x}(\rho_i))^2 + (\dot{y}(\tau_i) - \dot{y}(\sigma_i))^2} \\
&\le \sqrt{C^2(\tau_i - \rho_i)^2 + C^2(\tau_i - \sigma_i)^2} \\
&\le \sqrt{2}\, C\, \Phi(Z).
\end{aligned}
\]

For the difference between the Riemann sums and the lengths of the polygonal chains one obtains the estimate

\[
\left| \sum_{i=1}^{n} \left( \sqrt{\dot{x}(\tau_i)^2 + \dot{y}(\tau_i)^2} - \sqrt{\dot{x}(\rho_i)^2 + \dot{y}(\sigma_i)^2} \right) (t_i - t_{i-1}) \right|
\le \sqrt{2}\, C\, \Phi(Z) \sum_{i=1}^{n} (t_i - t_{i-1}) = \sqrt{2}\, C\, \Phi(Z)(b - a).
\]

For $\Phi(Z) \to 0$, this difference tends to zero. Thus the Riemann sums and the lengths of the inscribed polygonal chains have the same limit, namely $L$.

The proof of the general case, where the components of the velocity vector are not Lipschitz continuous, is similar. However, one additionally needs the fact that continuous functions on bounded, closed intervals are uniformly continuous. This is briefly addressed near the end of Appendix C.4. □


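Example 14.19 (Length of a circular arc) The parametric representation of a circle of radius $R$ and its derivative is

\[
\begin{aligned}
x(t) &= R \cos t, & \dot{x}(t) &= -R \sin t, \\
y(t) &= R \sin t, & \dot{y}(t) &= R \cos t,
\end{aligned}
\qquad 0 \le t \le 2\pi.
\]

The circumference of the circle is thus

\[
L = \int_0^{2\pi} \sqrt{(-R \sin t)^2 + (R \cos t)^2} \, dt = \int_0^{2\pi} R \, dt = 2R\pi.
\]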
Experiment 14.20 Use the MATLAB program mat14_5.m to approximate the circumference of the unit circle using inscribed polygonal chains. Modify the program so that it approximates the lengths of arbitrary differentiable curves.
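The idea of the experiment can be sketched in a few lines of Python. This is only an illustration and not the book's program: the function name `polygonal_length` and the choice of an equidistant partition are assumptions made here.

```python
import math

def polygonal_length(x, y, a, b, n):
    """Length of the inscribed polygonal chain for the curve t -> (x(t), y(t)),
    using the equidistant partition of [a, b] into n subintervals."""
    total = 0.0
    x_prev, y_prev = x(a), y(a)
    for i in range(1, n + 1):
        t = a + (b - a) * i / n
        x_cur, y_cur = x(t), y(t)
        # Euclidean length of the chord between consecutive curve points
        total += math.hypot(x_cur - x_prev, y_cur - y_prev)
        x_prev, y_prev = x_cur, y_cur
    return total

if __name__ == "__main__":
    # Unit circle: the chain lengths increase towards the circumference 2*pi.
    for n in (4, 16, 64, 256):
        print(n, polygonal_length(math.cos, math.sin, 0.0, 2.0 * math.pi, n))
```

Because the component functions are passed as arguments, the same routine applies to any curve whose components can be evaluated, which is the modification the experiment asks for.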

Definition 14.21 (Arc length) Let $t \mapsto \mathbf{x}(t)$ be a differentiable curve. The length of the curve segment from the initial parameter value $a$ to the current parameter value $t$ is called the arc length,

\[
s = L(t) = \int_a^t \sqrt{\dot{x}(\tau)^2 + \dot{y}(\tau)^2} \, d\tau.
\]

The arc length $s$ is a strictly monotonically increasing, continuous (even continuously differentiable) function. It is thus suitable for a reparametrisation $t = L^{-1}(s)$. The curve

\[
s \mapsto \boldsymbol{\xi}(s) = \mathbf{x}(L^{-1}(s))
\]

is called parametrised by arc length.

In the following let $t \mapsto \mathbf{x}(t)$ be a differentiable curve (in the plane). The angle of the tangent vector with the positive $x$-axis is denoted by $\varphi(t)$; that is,

\[
\tan \varphi(t) = \frac{\dot{y}(t)}{\dot{x}(t)}.
\]


Definition 14.22 (Curvature of a plane curve) The curvature of a differentiable curve in the plane is the rate of change of the angle $\varphi$ with respect to the arc length,

\[
\kappa = \frac{d\varphi}{ds} = \frac{d}{ds}\, \varphi(L^{-1}(s)).
\]

Figure 14.9 illustrates this definition. If $\varphi$ is the angle at the length $s$ of the arc and $\varphi + \Delta\varphi$ the angle at the length $s + \Delta s$, then $\kappa = \lim_{\Delta s \to 0} \frac{\Delta\varphi}{\Delta s}$. This shows that the value of $\kappa$ actually corresponds to the intuitive meaning of curvature. Note that the curvature of a plane curve comes with a sign; when the direction of traversal is reversed, the sign changes.


Proposition 14.23 The curvature of a twice continuously differentiable curve at the point $(x(t), y(t))$ of the curve is

\[
\kappa(t) = \frac{\dot{x}(t)\ddot{y}(t) - \dot{y}(t)\ddot{x}(t)}{\left(\dot{x}(t)^2 + \dot{y}(t)^2\right)^{3/2}}.
\]

Fig. 14.9 Curvature


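Proof According to the chain rule and the inverse function rule, one gets

\[
\kappa = \frac{d}{ds}\, \varphi(L^{-1}(s)) = \dot{\varphi}(L^{-1}(s)) \cdot \frac{d}{ds} L^{-1}(s) = \dot{\varphi}(L^{-1}(s)) \cdot \frac{1}{\dot{L}(L^{-1}(s))}.
\]

Differentiating the arc length

\[
s = L(t) = \int_a^t \sqrt{\dot{x}(\tau)^2 + \dot{y}(\tau)^2} \, d\tau
\]

with respect to $t$ gives

\[
\frac{ds}{dt} = \dot{L}(t) = \sqrt{\dot{x}(t)^2 + \dot{y}(t)^2}.
\]

Differentiating the relationship $\tan \varphi(t) = \dot{y}(t)/\dot{x}(t)$ leads to

\[
\dot{\varphi}(t)\left(1 + \tan^2 \varphi(t)\right) = \frac{\dot{x}(t)\ddot{y}(t) - \dot{y}(t)\ddot{x}(t)}{\dot{x}(t)^2},
\]

which gives, after substituting the above expression for $\tan \varphi(t)$ and simplifying,

\[
\dot{\varphi}(t) = \frac{\dot{x}(t)\ddot{y}(t) - \dot{y}(t)\ddot{x}(t)}{\dot{x}(t)^2 + \dot{y}(t)^2}.
\]

If one takes into account the relation $t = L^{-1}(s)$ and substitutes the derived expressions for $\dot{\varphi}(t)$ and $\dot{L}(t)$ into the formula for $\kappa$ at the beginning of the proof, one obtains

\[
\kappa(t) = \frac{\dot{\varphi}(t)}{\dot{L}(t)} = \frac{\dot{x}(t)\ddot{y}(t) - \dot{y}(t)\ddot{x}(t)}{\left(\dot{x}(t)^2 + \dot{y}(t)^2\right)^{3/2}},
\]

which is the desired assertion. □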
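The formula of Proposition 14.23 translates directly into code. The following Python sketch is an illustration only; the function names and the use of central differences (with an assumed step size `h`) to approximate the derivatives are choices made here, not taken from the book.

```python
import math

def curvature(dx, dy, ddx, ddy):
    """Signed curvature from the first and second derivatives at one parameter value."""
    return (dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5

def curvature_numeric(x, y, t, h=1e-4):
    """Same formula, with the derivatives replaced by central differences."""
    dx = (x(t + h) - x(t - h)) / (2 * h)
    dy = (y(t + h) - y(t - h)) / (2 * h)
    ddx = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    ddy = (y(t + h) - 2 * y(t) + y(t - h)) / (h * h)
    return curvature(dx, dy, ddx, ddy)

if __name__ == "__main__":
    # Circle of radius 2, traversed in the positive and in the negative direction:
    # the two values differ only in sign.
    print(curvature_numeric(lambda s: 2 * math.cos(s), lambda s: 2 * math.sin(s), 0.7))
    print(curvature_numeric(lambda s: 2 * math.cos(-s), lambda s: 2 * math.sin(-s), 0.7))
```

The second call illustrates the remark after Definition 14.22 that reversing the direction of traversal changes the sign of the curvature.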
Remark 14.24 As a special case, the curvature of the graph of a twice differentiable function $y = f(x)$ can be obtained as

\[
\kappa(x) = \frac{f''(x)}{\left(1 + f'(x)^2\right)^{3/2}}.
\]

This follows easily from the above proposition by using the parametrisation $x = t$, $y = f(t)$.

Example 14.25 The curvature of a circle of radius $R$, traversed in the positive direction, is constant and equal to $\kappa = \frac{1}{R}$. Indeed

\[
\begin{aligned}
x(t) &= R \cos t, & \dot{x}(t) &= -R \sin t, & \ddot{x}(t) &= -R \cos t, \\
y(t) &= R \sin t, & \dot{y}(t) &= R \cos t, & \ddot{y}(t) &= -R \sin t,
\end{aligned}
\]

and thus

\[
\kappa = \frac{R^2 \sin^2 t + R^2 \cos^2 t}{\left(R^2 \sin^2 t + R^2 \cos^2 t\right)^{3/2}} = \frac{1}{R}.
\]

One obtains the same result from the following geometric consideration. At the point $(x, y) = (R \cos t, R \sin t)$ the angle $\varphi$ of the tangent vector with the positive $x$-axis is equal to $t + \pi/2$, and the arc length is $s = Rt$. Therefore $\varphi = s/R + \pi/2$ which differentiated with respect to $s$ gives $\kappa = 1/R$.

Definition  14.26  The  osculating  circle  at  a  point  of  a  differentiable  curve  is  the 
circle  which  has  the  same  tangent  and  the  same  curvature  as  the  curve. 

According to Example 14.25 it follows that the osculating circle has the radius $\frac{1}{|\kappa(t)|}$ and its centre $\mathbf{x}_c(t)$ lies on the normal of the curve. It is given by

\[
\mathbf{x}_c(t) = \mathbf{x}(t) + \frac{1}{\kappa(t)}\, \mathbf{N}(t).
\]
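As a small illustration, the centre and radius of the osculating circle can be computed from the derivatives at one parameter value. The sketch below assumes that the unit normal is the tangent vector rotated by 90 degrees in the positive direction, $\mathbf{N} = (-\dot{y}, \dot{x})/\|\dot{\mathbf{x}}\|$; the function name `osculating_circle` and its argument layout are hypothetical.

```python
import math

def osculating_circle(p, d1, d2):
    """Centre and radius of the osculating circle at one point of a plane curve.
    p = (x, y), d1 = first derivatives, d2 = second derivatives."""
    speed = math.hypot(d1[0], d1[1])
    # Signed curvature according to Proposition 14.23
    kappa = (d1[0] * d2[1] - d1[1] * d2[0]) / speed ** 3
    if kappa == 0.0:
        raise ValueError("curvature is zero: no osculating circle")
    # Assumption: unit normal = unit tangent rotated by +90 degrees
    normal = (-d1[1] / speed, d1[0] / speed)
    centre = (p[0] + normal[0] / kappa, p[1] + normal[1] / kappa)
    return centre, 1.0 / abs(kappa)

if __name__ == "__main__":
    # Parabola x = t, y = t^2 at t = 0
    print(osculating_circle((0.0, 0.0), (1.0, 0.0), (0.0, 2.0)))
```

For a circle of radius $R$ about the origin the routine returns the origin as centre and $R$ as radius, as one would expect.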

Example 14.27 (Clothoid) The clothoid is a curve whose curvature is proportional to its arc length. In applications it serves as a connecting link from a straight line (with curvature 0) to a circular arc (with curvature $\frac{1}{R}$). It is used in railway engineering and road design. Its defining property is

\[
\kappa(s) = \frac{d\varphi}{ds} = c \cdot s
\]

for a certain $c \in \mathbb{R}$. If one starts with curvature 0 at $s = 0$ then the angle is equal to

\[
\varphi(s) = \frac{c}{2}\, s^2.
\]

Fig. 14.10 Clothoid


We use $s$ as the curve parameter. Differentiating the relation

\[
s = \int_0^s \sqrt{\dot{x}(\sigma)^2 + \dot{y}(\sigma)^2} \, d\sigma
\]

shows that

\[
1 = \sqrt{\dot{x}(s)^2 + \dot{y}(s)^2};
\]

thus, the velocity vector of a curve parametrised by arc length has length one. This implies in particular

\[
\frac{dx}{ds} = \cos \varphi(s), \qquad \frac{dy}{ds} = \sin \varphi(s).
\]

From there we can compute the parametrisation of the curve:

\[
\begin{aligned}
x(s) &= \int_0^s \frac{dx}{ds}(\sigma) \, d\sigma = \int_0^s \cos \varphi(\sigma) \, d\sigma = \int_0^s \cos\left(\frac{c}{2}\sigma^2\right) d\sigma, \\
y(s) &= \int_0^s \frac{dy}{ds}(\sigma) \, d\sigma = \int_0^s \sin \varphi(\sigma) \, d\sigma = \int_0^s \sin\left(\frac{c}{2}\sigma^2\right) d\sigma.
\end{aligned}
\]

The components of the curve are thus given by Fresnel’s integrals. The shape of the curve is displayed in Fig. 14.10; its numerical calculation can be seen in the MATLAB program mat14_6.m.
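The two integrals have no elementary antiderivative, so the points of the clothoid have to be computed numerically. The following Python sketch is not the book's program; it assumes the curve starts at the origin, and the function name `clothoid_point`, the default $c = 1$ and the use of the composite Simpson rule are choices made here for illustration.

```python
import math

def clothoid_point(s, c=1.0, n=1000):
    """Approximate (x(s), y(s)) of the clothoid with curvature kappa(s) = c*s
    by the composite Simpson rule with n (even) subintervals of [0, s]."""
    if n % 2:
        n += 1
    h = s / n
    sum_x = sum_y = 0.0
    for i in range(n + 1):
        sigma = i * h
        # Simpson weights 1, 4, 2, 4, ..., 2, 4, 1
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        phi = 0.5 * c * sigma * sigma  # tangent angle phi(sigma) = c/2 * sigma^2
        sum_x += weight * math.cos(phi)
        sum_y += weight * math.sin(phi)
    return sum_x * h / 3.0, sum_y * h / 3.0

if __name__ == "__main__":
    for k in range(0, 6):
        print(k, clothoid_point(float(k)))
```

With $c = \pi$ the two components coincide with the normalised Fresnel integrals $C(s)$ and $S(s)$, which gives a convenient check against tabulated values.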
14.3 Plane Curves in Polar Coordinates

By  writing  the  parametric  representation  in  the  form 

x(t)  =  r(t )  cos  t, 
y(t)  =  r(t)  sin  t 

in  polar  coordinates  with  t  as  angle  and  r(t)  as  radius,  one  obtains  a  simple  way 
of  representing  many  curves.  By  convention  negative  radii  are  plotted  in  opposite 
direction  of  the  ray  with  angle  t. 

Example 14.28 (Spirals) The Archimedean³ spiral is defined by

\[
r(t) = t, \quad 0 \le t < \infty,
\]

the logarithmic spiral by

\[
r(t) = e^t, \quad -\infty < t < \infty,
\]

the hyperbolic spiral by

\[
r(t) = \frac{1}{t}, \quad 0 < t < \infty.
\]

Typical  parts  of  these  spirals  are  displayed  in  Fig.  14.11. 

Experiment 14.29 Study the behaviour of the logarithmic spiral near the origin using the zoom tool (use the M-file mat14_7.m).

Example 14.30 (Loops) Loops are obtained by choosing $r(t) = \cos nt$, $n \in \mathbb{N}$. In Cartesian coordinates the parametric representation thus reads

\[
\begin{aligned}
x(t) &= \cos nt \cos t, \\
y(t) &= \cos nt \sin t.
\end{aligned}
\]


Fig.  14.11  Archimedean,  logarithmic  and  hyperbolic  spirals 


3  Archimedes  of  Syracuse,  287-212  B.C. 


Fig. 14.12 Loops with $r = \cos t$ and $r = \cos 2t$

Fig. 14.13 Loops with $r = \cos 3t$ and $r = \pm\sqrt{\cos 2t}$

The choice $n = 1$ results in a circle of radius $\frac12$ about $(\frac12, 0)$; for odd $n$ one obtains $n$ leaves, for even $n$ one obtains $2n$ leaves, see Figs. 14.12 and 14.13.

The figure eight from Fig. 14.13 is obtained by $r(t) = \sqrt{\cos 2t}$ and $r(t) = -\sqrt{\cos 2t}$, respectively, for $-\frac{\pi}{4} \le t \le \frac{\pi}{4}$, where the positive root gives the right leaf and the negative root the left leaf. This curve is called lemniscate.
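The passage from polar to Cartesian coordinates used in these examples can be sketched as follows. The helper names `polar_curve` and `linspace` are hypothetical; the only point of the sketch is that the formulas $x = r(t)\cos t$, $y = r(t)\sin t$ automatically place negative radii on the opposite ray.

```python
import math

def polar_curve(r, t_values):
    """Cartesian points (r(t) cos t, r(t) sin t) for the given parameter values.
    A negative value r(t) lands on the ray opposite to the angle t."""
    return [(r(t) * math.cos(t), r(t) * math.sin(t)) for t in t_values]

def linspace(a, b, n):
    """n + 1 equally spaced parameter values from a to b (hypothetical helper)."""
    return [a + (b - a) * i / n for i in range(n + 1)]

# Loop with n = 1: r(t) = cos t traces the circle of radius 1/2 about (1/2, 0).
circle = polar_curve(math.cos, linspace(0.0, math.pi, 200))

# Right leaf of the lemniscate: r(t) = sqrt(cos 2t) for -pi/4 <= t <= pi/4.
# max(..., 0.0) guards against tiny negative rounding errors at the endpoints.
right_leaf = polar_curve(lambda t: math.sqrt(max(math.cos(2.0 * t), 0.0)),
                         linspace(-math.pi / 4.0, math.pi / 4.0, 200))
```

The resulting lists of points can be handed to any plotting routine; for the left leaf of the lemniscate one would pass the negative root instead.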

Example 14.31 (Cardioid) The cardioid is a special epicycloid, where one circle is rolling around another circle with the same radius $A$. Its parametric representation is

\[
\begin{aligned}
x(t) &= 2A \cos t + A \cos 2t, \\
y(t) &= 2A \sin t + A \sin 2t
\end{aligned}
\]

for $0 \le t \le 2\pi$. The cardioid with radius $A = 1$ is shown in Fig. 14.14.

Fig. 14.14 Cardioid with $A = 1$
14.4 Parametrised Space Curves

In the same way as for plane curves, a parametrised curve in space is defined as a continuous mapping of an interval $[a, b]$ to $\mathbb{R}^3$,

\[
t \mapsto \mathbf{x}(t) = \begin{bmatrix} x(t) \\ y(t) \\ z(t) \end{bmatrix}, \quad a \le t \le b.
\]

The curve is called differentiable, if all three components $t \mapsto x(t)$, $t \mapsto y(t)$, $t \mapsto z(t)$ are differentiable real-valued functions.

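Velocity and tangent vector of a differentiable curve in space are defined as in the planar case by

\[
\dot{\mathbf{x}}(t) = \begin{bmatrix} \dot{x}(t) \\ \dot{y}(t) \\ \dot{z}(t) \end{bmatrix}, \qquad
\mathbf{T}(t) = \frac{\dot{\mathbf{x}}(t)}{\|\dot{\mathbf{x}}(t)\|} = \frac{1}{\sqrt{\dot{x}(t)^2 + \dot{y}(t)^2 + \dot{z}(t)^2}} \begin{bmatrix} \dot{x}(t) \\ \dot{y}(t) \\ \dot{z}(t) \end{bmatrix}.
\]

The second derivative $\ddot{\mathbf{x}}(t)$ is the acceleration vector. In the spatial case there is a normal plane to the curve which is spanned by the normal vector

\[
\mathbf{N}(t) = \frac{1}{\|\dot{\mathbf{T}}(t)\|}\, \dot{\mathbf{T}}(t)
\]

and the binormal vector

\[
\mathbf{B}(t) = \mathbf{T}(t) \times \mathbf{N}(t),
\]

provided that $\dot{\mathbf{x}}(t) \ne \mathbf{0}$, $\dot{\mathbf{T}}(t) \ne \mathbf{0}$. Since $\|\mathbf{T}(t)\| = 1$ for all $t$, the formula

\[
0 = \frac{d}{dt} \|\mathbf{T}(t)\|^2 = 2 \langle \dot{\mathbf{T}}(t), \mathbf{T}(t) \rangle
\]

(which is verified by a straightforward computation) implies that $\dot{\mathbf{T}}(t)$ is perpendicular to $\mathbf{T}(t)$. Therefore, the three vectors $(\mathbf{T}(t), \mathbf{N}(t), \mathbf{B}(t))$ form an orthogonal basis in $\mathbb{R}^3$, called the moving frame of the curve.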

Rectifiability of a curve in space is defined in analogy to Definition 14.16 for plane curves. The length of a differentiable curve in space can be computed by

\[
L = \int_a^b \|\dot{\mathbf{x}}(t)\| \, dt = \int_a^b \sqrt{\dot{x}(t)^2 + \dot{y}(t)^2 + \dot{z}(t)^2} \, dt.
\]

Also, the arc length can be defined similarly to the planar case (Definition 14.21).

Fig. 14.15 Helix with tangent, normal and binormal vector


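Example 14.32 (Helix) The parametric representation of the helix is

\[
\mathbf{x}(t) = \begin{bmatrix} \cos t \\ \sin t \\ t \end{bmatrix}, \quad -\infty < t < \infty.
\]

We obtain

\[
\dot{\mathbf{x}}(t) = \begin{bmatrix} -\sin t \\ \cos t \\ 1 \end{bmatrix}, \qquad
\mathbf{T}(t) = \frac{1}{\sqrt{2}} \begin{bmatrix} -\sin t \\ \cos t \\ 1 \end{bmatrix},
\]

\[
\dot{\mathbf{T}}(t) = \frac{1}{\sqrt{2}} \begin{bmatrix} -\cos t \\ -\sin t \\ 0 \end{bmatrix}, \qquad
\mathbf{N}(t) = \begin{bmatrix} -\cos t \\ -\sin t \\ 0 \end{bmatrix}
\]

with binormal vector

\[
\mathbf{B}(t) = \frac{1}{\sqrt{2}} \begin{bmatrix} -\sin t \\ \cos t \\ 1 \end{bmatrix} \times \begin{bmatrix} -\cos t \\ -\sin t \\ 0 \end{bmatrix}
= \frac{1}{\sqrt{2}} \begin{bmatrix} \sin t \\ -\cos t \\ 1 \end{bmatrix}.
\]

The formula for the arc length of the helix, counting from the origin, is particularly simple:

\[
L(t) = \int_0^t \sqrt{2} \, d\tau = \sqrt{2}\, t.
\]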
Figure 14.15 was drawn using the MATLAB commands

    t = 0:pi/100:6*pi;
    plot3(cos(t), sin(t), t/10)

The Java applet Parametric curves in space offers dynamic visualising possibilities of those and other curves in space and of their moving frames.
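The closed forms of the moving frame in Example 14.32 can be checked numerically. The following Python sketch uses hypothetical helper names (`cross`, `dot`, `helix_frame`); it evaluates $\mathbf{T}$ and $\mathbf{N}$ from the formulas of the example and forms $\mathbf{B} = \mathbf{T} \times \mathbf{N}$ explicitly.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Inner product of two vectors in R^3."""
    return sum(a * b for a, b in zip(u, v))

def helix_frame(t):
    """Moving frame (T, N, B) of the helix x(t) = (cos t, sin t, t)."""
    r = 1.0 / math.sqrt(2.0)
    T = (-r * math.sin(t), r * math.cos(t), r)
    N = (-math.cos(t), -math.sin(t), 0.0)
    B = cross(T, N)
    return T, N, B

if __name__ == "__main__":
    T, N, B = helix_frame(0.8)
    # The three inner products should vanish up to rounding errors.
    print(dot(T, N), dot(T, B), dot(N, B))
```

Since all three vectors have length one, the frame of the helix is in fact orthonormal.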


14.5  Exercises 

1. Find out which geometric formation is represented by the set of zeros of the polynomial $y^2 - x(x^2 - 1) = 0$. Visualise the curve in maple using the command implicitplot. Can you parametrise it as a continuous curve?

2. Verify that the algebraic curves $y^2 - (x + p)x^2 = 0$, $p \in \mathbb{R}$ (Example 14.7) can be parametrised by

\[
\begin{aligned}
x(t) &= t^2 - p, \\
y(t) &= t(t^2 - p),
\end{aligned}
\qquad -\infty < t < \infty.
\]

Visualise the curves for $p = -1, 0, 1$ in maple using the command implicitplot.

3. Using MATLAB or maple, investigate the shape of Lissajous figures⁴

\[
x(t) = \sin(\omega_1 t), \quad y(t) = \cos(\omega_2 t)
\]


and 


x(t)  =  sin(wit),  y(t)  =  cos  (^W2t  -f — j. 


o 

Consider  the  cases  u>2  =  w 2  =  2w\,  W2  =  \ w\  and  explain  the  results. 

The  following  exercises  use  the  Java  applets  Parametric  curves  in  the  plane 
and  Parametric  curves  in  space. 

4. (a) Using the Java applet analyse where the cycloid

\[
\begin{aligned}
x(t) &= t - 2 \sin t, \\
y(t) &= 1 - 2 \cos t,
\end{aligned}
\qquad -2\pi \le t \le 2\pi
\]

has its maximal speed ($\|\dot{\mathbf{x}}(t)\| \to \max$) and check your result by hand.


⁴J.A. Lissajous, 1822-1880.

(b) Discuss and explain the shape of the loops

\[
\begin{aligned}
x(t) &= \cos nt \cos t, \\
y(t) &= \cos nt \sin t,
\end{aligned}
\qquad 0 \le t \le 2\pi
\]

for $n = 1, 2, 3, 4, 5$ using the Java applets (plot the moving frame).

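5. Study the velocity and the acceleration of the following curves by using the Java applet. Verify your results by computing the points where the curve has either a horizontal tangent ($\dot{x}(t) \ne 0$, $\dot{y}(t) = 0$) or a vertical tangent ($\dot{x}(t) = 0$, $\dot{y}(t) \ne 0$), or is singular ($\dot{x}(t) = 0$, $\dot{y}(t) = 0$).

(a) Cycloid:

\[
\begin{aligned}
x(t) &= t - \sin t, \\
y(t) &= 1 - \cos t,
\end{aligned}
\qquad -2\pi \le t \le 2\pi.
\]

(b) Cardioid:

\[
\begin{aligned}
x(t) &= 2 \cos t + \cos 2t, \\
y(t) &= 2 \sin t + \sin 2t,
\end{aligned}
\qquad 0 \le t \le 2\pi.
\]

6. Analyse and explain the trajectories of the curves

\[
\mathbf{x}(t) = \begin{bmatrix} 1 - 2t^2 \\ (1 - 2t^2)^2 \end{bmatrix}, \quad -1 \le t \le 1,
\]

\[
\mathbf{y}(t) = \begin{bmatrix} \cos t \\ \cos^2 t \end{bmatrix}, \quad 0 \le t \le 2\pi,
\]

\[
\mathbf{z}(t) = \begin{bmatrix} t \cos t \\ t^2 \cos^2 t \end{bmatrix}, \quad -2 \le t \le 2.
\]

Are these curves (geometrically) equivalent?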

7. (a) Compute the curvature $\kappa(t)$ of the branch of the hyperbola

\[
\begin{aligned}
x(t) &= \cosh t, \\
y(t) &= \sinh t,
\end{aligned}
\qquad -\infty < t < \infty.
\]

(b) Determine its osculating circle (centre and radius) at $t = 0$.

8. Consider the ellipse

\[
\mathbf{x}(t) = \begin{bmatrix} 2 \cos t \\ \sin t \end{bmatrix}, \quad -\pi \le t \le \pi.
\]

(a) Compute its velocity vector $\dot{\mathbf{x}}(t)$, its acceleration vector $\ddot{\mathbf{x}}(t)$ as well as the moving frame $(\mathbf{T}(t), \mathbf{N}(t))$.

(b) Compute its curvature $\kappa(t)$ and determine the osculating circle (centre and radius) at $t = 0$.
9. (a) Analyse the trajectory of the astroid

\[
\mathbf{x}(t) = \begin{bmatrix} \cos^3 t \\ \sin^3 t \end{bmatrix}, \quad 0 \le t \le 2\pi.
\]

(b) Compute the length of the part of the astroid which lies in the first quadrant.

10. (a) Compute the velocity vector $\dot{\mathbf{x}}(t)$ and the moving frame $(\mathbf{T}(t), \mathbf{N}(t))$ for the segment

\[
\mathbf{x}(t) = \begin{bmatrix} e^t \cos t \\ e^t \sin t \end{bmatrix}, \quad 0 \le t \le \pi/2
\]

of the logarithmic spiral. At what point in the interval $[0, \pi/2]$ does it have a vertical tangent?

(b) Compute the length of the segment. Deduce a formula for its arc length $s = L(t)$.

(c) Reparametrise the spiral by its arc length, i.e., compute $\boldsymbol{\xi}(s) = \mathbf{x}(L^{-1}(s))$ and verify that $\|\dot{\boldsymbol{\xi}}(s)\| = 1$.

11. (Application of the secant and cosecant functions) Analyse what plane curves are determined in polar coordinates by

\[
r(t) = \sec t, \quad -\pi/2 < t < \pi/2 \qquad \text{and} \qquad r(t) = \csc t, \quad 0 < t < \pi.
\]

12. (a) Determine the tangent and the normal to the graph of the function $y = 1/x$ at $(x_0, y_0) = (1, 1)$ and compute its curvature at that point.

(b) Suppose the graph of the function $y = 1/x$ is to be replaced by a circular arc at $x_0$, i.e., for $x \ge 1$. Find the centre and the radius of a circle which admits a smooth transition (same tangent, same curvature).

13. (a) Analyse the space curve

\[
\mathbf{x}(t) = \begin{bmatrix} \cos t \\ \sin t \\ 2 \sin \frac{t}{2} \end{bmatrix}, \quad 0 \le t \le 4\pi
\]

using the applet.

(b) Check that the curve is the intersection of the cylinder $x^2 + y^2 = 1$ with the sphere $(x + 1)^2 + y^2 + z^2 = 4$.

Hint. Use $\sin^2 \frac{t}{2} = \frac12 (1 - \cos t)$.

14. Using MATLAB, maple or the applet, sketch and discuss the space curves

\[
t \mapsto \begin{bmatrix} t \cos t \\ t \sin t \\ 2t \end{bmatrix}, \quad 0 \le t < \infty,
\]

and

\[
t \mapsto \begin{bmatrix} \cos t \\ \sin t \\ 0 \end{bmatrix}, \quad 0 \le t \le 4\pi.
\]

15.  Sketch  and  discuss  the  space  curves 


1 

1 _ 

1 

1 _ 

Of. 

II 

1 

_ 1 

V- 

OK 

II 

1 

Cl  CC 

_ 1 

Compute their velocity vectors $\dot{\mathbf{x}}(t)$, $\dot{\mathbf{y}}(t)$ and their acceleration vectors $\ddot{\mathbf{x}}(t)$, $\ddot{\mathbf{y}}(t)$.

16.  Sketch  the  space  curve 


cosh  t 
cosh  t 


0  <  t  <  1. 


Compute its moving frame $(\mathbf{T}(t), \mathbf{N}(t), \mathbf{B}(t))$ as well as its length.

17.  Sketch  the  space  curve 


cos  t 
sint 


0  <  t  <  2tt, 


and  compute  its  length. 



Scalar-Valued  Functions  of  Two 
Variables 


This  chapter  is  devoted  to  differential  calculus  of  functions  of  two  variables.  In 
particular  we  will  study  geometrical  objects  such  as  tangents  and  tangent  planes, 
maxima  and  minima,  as  well  as  linear  and  quadratic  approximations.  The  restriction 
to  two  variables  has  been  made  for  simplicity  of  presentation.  All  ideas  in  this  and  the 
next  chapter  can  easily  be  extended  (although  with  slightly  more  notational  effort) 
to  the  case  of  n  variables. 

We begin by studying the graph of a function with the help of vertical cuts and level sets. As a further tool we introduce partial derivatives, which describe the rate of change of the function in the direction of the coordinate axes. Finally the notion of the Fréchet derivative allows us to define the tangent plane to the graph. As for functions of one variable the Taylor formula plays a central role. We use it, e.g., to determine extrema of functions of two variables.

In the entire chapter $D$ denotes a subset of $\mathbb{R}^2$, and

\[
f : D \subset \mathbb{R}^2 \to \mathbb{R} : (x, y) \mapsto z = f(x, y)
\]

denotes a scalar-valued function of two variables. Details of vector and matrix algebra used in this chapter can be found in Appendices A and B.


15.1 Graph and Partial Mappings

The graph

\[
G = \{(x, y, z) \in D \times \mathbb{R} \,;\, z = f(x, y)\} \subset \mathbb{R}^3
\]

of a function of two variables $f : D \to \mathbb{R}$ is a surface in space, if $f$ is sufficiently regular. To describe the properties of this surface we consider particular curves on it. The partial mappings

\[
x \mapsto f(x, b), \qquad y \mapsto f(a, y)
\]

are obtained by fixing one of the two variables $y = b$ or $x = a$. The partial mappings can be used to introduce the space curves

\[
x \mapsto \begin{bmatrix} x \\ b \\ f(x, b) \end{bmatrix}, \qquad
y \mapsto \begin{bmatrix} a \\ y \\ f(a, y) \end{bmatrix}.
\]

These curves lie on the graph $G$ of the function and are called coordinate curves. Geometrically they are obtained as the intersection of $G$ with the vertical planes $y = b$ and $x = a$ respectively, see Fig. 15.1, left.

Fig. 15.1 Graph of a function as surface in space with coordinate curves (left) and level curve $N_c$ (right)

The level curves are the projections of the intersections of the graph $G$ with the horizontal planes $z = c$ to the $(x, y)$-plane,

\[
N_c = \{(x, y) \in D \,;\, f(x, y) = c\},
\]

see Fig. 15.1, right. The set $N_c$ is called level curve at level $c$.

Example 15.1 The graph of the quadratic function

\[
f : \mathbb{R}^2 \to \mathbb{R} : (x, y) \mapsto z = \frac{x^2}{a^2} - \frac{y^2}{b^2}
\]

describes a surface in space which is shaped like a saddle and called hyperbolic paraboloid. Figure 15.2 shows the graph of $z = x^2/4 - y^2/5$ with coordinate curves (left) as well as some level curves (right).

Fig. 15.2 The picture on the left shows the graph of the function $z = x^2/4 - y^2/5$ with coordinate curves. Furthermore, it shows the intersections with the planes $z = c$ for selected values of $c$. The picture on the right illustrates the level curves of the function for the same values of $c$ (lower levels correspond to thicker lines). The two intersecting straight lines are the level curves at level $c = 0$

Experiment 15.2 With the help of the MATLAB program mat15_1.m visualise the elliptic paraboloid $z = x^2 + 2y^2 - 4x + 1$. Choose a suitable domain $D$ and plot the graph and some level curves.


15.2  Continuity 

As for functions of one variable (see Chap. 6) we characterise the continuity of functions of two variables by means of sequences. Thus we need the concept of convergence of vector-valued sequences.

Let $(\mathbf{a}_n)_{n \ge 1} = (\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \ldots)$ be a sequence of points in $D$ with terms

\[
\mathbf{a}_n = (a_n, b_n) \in D \subset \mathbb{R}^2.
\]

The sequence $(\mathbf{a}_n)_{n \ge 1}$ is said to converge to $\mathbf{a} = (a, b) \in D$ as $n \to \infty$, if and only if both components of the sequence converge, i.e.

\[
\lim_{n \to \infty} a_n = a \quad \text{and} \quad \lim_{n \to \infty} b_n = b.
\]

This is denoted by

\[
(a_n, b_n) = \mathbf{a}_n \to \mathbf{a} = (a, b) \ \text{as} \ n \to \infty \quad \text{or} \quad \lim_{n \to \infty} \mathbf{a}_n = \mathbf{a}.
\]

Otherwise the sequence is called divergent.

Fig. 15.3 A function which is discontinuous along a straight line. For every sequence $(\mathbf{a}_n)$ which converges to $\mathbf{a}$, the images of the sequence $(f(\mathbf{a}_n))$ converge to $f(\mathbf{a})$. For the point $\mathbf{b}$, however, this does not hold; $f$ is discontinuous at that point


An example of a convergent vector-valued sequence is

\[
\lim_{n \to \infty} \left( \frac{1}{n}, \frac{2n}{3n + 4} \right) = \left( 0, \frac{2}{3} \right).
\]

Definition 15.3 A function $f : D \to \mathbb{R}$ is called continuous at the point $\mathbf{a} \in D$, if

\[
\lim_{n \to \infty} f(\mathbf{a}_n) = f(\mathbf{a})
\]

for all sequences $(\mathbf{a}_n)_{n \ge 1}$ which converge to $\mathbf{a}$ in $D$.

For continuous functions, the limit and the function sign can be interchanged. Figure 15.3 shows a function which is discontinuous along a straight line but continuous everywhere else.


15.3 Partial Derivatives

The partial derivatives of a function of two variables are the derivatives of the partial mappings.

Definition 15.4 Let $D \subset \mathbb{R}^2$ be open, $f : D \to \mathbb{R}$ and $\mathbf{a} = (a, b) \in D$. The function $f$ is called partially differentiable with respect to $x$ at the point $\mathbf{a}$, if the limit

\[
\frac{\partial f}{\partial x}(a, b) = \lim_{x \to a} \frac{f(x, b) - f(a, b)}{x - a}
\]

exists. It is called partially differentiable with respect to $y$ at the point $\mathbf{a}$, if the limit

\[
\frac{\partial f}{\partial y}(a, b) = \lim_{y \to b} \frac{f(a, y) - f(a, b)}{y - b}
\]

exists. The expressions

\[
\frac{\partial f}{\partial x}(a, b) \quad \text{and} \quad \frac{\partial f}{\partial y}(a, b)
\]

are called partial derivatives of $f$ with respect to $x$ and $y$, respectively, at the point $(a, b)$. Further $f$ is called partially differentiable at $\mathbf{a}$, if both partial derivatives exist.

Fig. 15.4 Geometric interpretation of partial derivatives


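Another notation for partial derivatives at the point $(x, y)$ is

\[
\frac{\partial f}{\partial x}(x, y) = \frac{\partial}{\partial x} f(x, y) = \partial_1 f(x, y)
\]

and likewise

\[
\frac{\partial f}{\partial y}(x, y) = \frac{\partial}{\partial y} f(x, y) = \partial_2 f(x, y).
\]

Geometrically, partial derivatives can be interpreted as slopes of the tangents to the coordinate curves $x \mapsto [x, b, f(x, b)]^{\mathsf T}$ and $y \mapsto [a, y, f(a, y)]^{\mathsf T}$, see Fig. 15.4.

The two tangent vectors $\mathbf{v}$ and $\mathbf{w}$ to the coordinate curves at the point $(a, b, f(a, b))$ can therefore be represented as

\[
\mathbf{v} = \begin{bmatrix} 1 \\ 0 \\ \frac{\partial f}{\partial x}(a, b) \end{bmatrix}, \qquad
\mathbf{w} = \begin{bmatrix} 0 \\ 1 \\ \frac{\partial f}{\partial y}(a, b) \end{bmatrix}.
\]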

Since partial differentiation is nothing else but ordinary differentiation with respect to one variable (while fixing the other one), the usual rules of differentiation apply, e.g. the product rule

\[
\frac{\partial}{\partial y}\bigl(f(x, y) \cdot g(x, y)\bigr) = \frac{\partial f}{\partial y}(x, y) \cdot g(x, y) + f(x, y) \cdot \frac{\partial g}{\partial y}(x, y).
\]

Fig. 15.5 Partially differentiable, discontinuous function


Example 15.5 Let $r : \mathbb{R}^2 \to \mathbb{R} : (x, y) \mapsto \sqrt{x^2 + y^2}$. This function is everywhere partially differentiable with the exception of $(x, y) = (0, 0)$. The partial derivatives are

\[
\frac{\partial r}{\partial x}(x, y) = \frac{1}{2} \frac{2x}{\sqrt{x^2 + y^2}} = \frac{x}{r(x, y)}, \qquad
\frac{\partial r}{\partial y}(x, y) = \frac{1}{2} \frac{2y}{\sqrt{x^2 + y^2}} = \frac{y}{r(x, y)}.
\]

In maple one can use the commands diff and Diff in order to calculate partial derivatives, e.g. in the above example:

    r := sqrt(x^2+y^2);
    diff(r, x);

Remark 15.6 In contrast to functions in one variable (see Application 7.16), partial differentiability does not imply continuity:

\[
f \ \text{partially differentiable} \ \not\Rightarrow \ f \ \text{continuous}.
\]

An example is given by the function (see Fig. 15.5)

\[
f(x, y) = \begin{cases}
\dfrac{xy}{x^2 + y^2}, & (x, y) \ne (0, 0), \\
0, & (x, y) = (0, 0).
\end{cases}
\]

This function is everywhere partially differentiable. In particular, at the point $(x, y) = (0, 0)$ one obtains

\[
\frac{\partial f}{\partial x}(0, 0) = \lim_{x \to 0} \frac{f(x, 0) - f(0, 0)}{x} = 0 = \lim_{y \to 0} \frac{f(0, y) - f(0, 0)}{y} = \frac{\partial f}{\partial y}(0, 0).
\]

However, the function is discontinuous at $(0, 0)$. In order to see this, we choose two sequences which converge to $(0, 0)$:

\[
\mathbf{a}_n = \left( \frac{1}{n}, \frac{1}{n} \right) \quad \text{and} \quad \mathbf{c}_n = \left( \frac{1}{n}, -\frac{1}{n} \right).
\]

We have

\[
\lim_{n \to \infty} f(\mathbf{a}_n) = \lim_{n \to \infty} \frac{1/n^2}{2/n^2} = \frac{1}{2},
\]

but also

\[
\lim_{n \to \infty} f(\mathbf{c}_n) = \lim_{n \to \infty} \frac{-1/n^2}{2/n^2} = -\frac{1}{2}.
\]

The limits do not coincide; in particular, they differ from $f(0, 0) = 0$.
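The behaviour along the two sequences is easy to reproduce. The following short Python sketch (an illustration only) evaluates the function of Remark 15.6 along both sequences.

```python
def f(x, y):
    """The function of Remark 15.6."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

# Along (1/n, 1/n) and (1/n, -1/n) the function values stay constant,
# although both sequences converge to (0, 0) where f vanishes.
for n in (1, 10, 100, 1000):
    print(n, f(1.0 / n, 1.0 / n), f(1.0 / n, -1.0 / n))
```

Along the coordinate axes, on the other hand, the function vanishes identically, which is why both partial derivatives at the origin exist and are equal to zero.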


Experiment 15.7 Visualise the function given in Remark 15.6 with the help of MATLAB and maple. Using the command

    plot3d(-x*y/(x^2+y^2), x=-1..1, y=-1..1, shading=zhue);

the corresponding plot can be obtained in maple.

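Higher-order partial derivatives. Let $D \subset \mathbb{R}^2$ be open and $f : D \to \mathbb{R}$ partially differentiable. The assignments

\[
\frac{\partial f}{\partial x} : D \to \mathbb{R} \quad \text{and} \quad \frac{\partial f}{\partial y} : D \to \mathbb{R}
\]

define themselves scalar-valued functions of two variables. If these functions are also partially differentiable, then $f$ is called twice partially differentiable. The notation in this case is

\[
\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right), \qquad
\frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right), \qquad \text{etc.}
\]

Note that there are four partial derivatives of second order.

Definition 15.8 A function $f : D \to \mathbb{R}$ is $k$-times continuously (partially) differentiable, denoted $f \in C^k(D)$, if $f$ is $k$-times partially differentiable and all partial derivatives up to order $k$ are continuous.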

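Example 15.9 The function $f(x, y) = e^{xy^2}$ is arbitrarily often partially differentiable, $f \in C^\infty(D)$, and the following holds

\[
\begin{aligned}
\frac{\partial f}{\partial x}(x, y) &= e^{xy^2} y^2, \\
\frac{\partial f}{\partial y}(x, y) &= e^{xy^2}\, 2xy, \\
\frac{\partial^2 f}{\partial x^2}(x, y) &= e^{xy^2} y^4, \\
\frac{\partial^2 f}{\partial y^2}(x, y) &= e^{xy^2} (4x^2y^2 + 2x), \\
\frac{\partial^2 f}{\partial y \partial x}(x, y) &= \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right)(x, y) = e^{xy^2} (2xy^3 + 2y), \\
\frac{\partial^2 f}{\partial x \partial y}(x, y) &= \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial y} \right)(x, y) = e^{xy^2} (2xy^3 + 2y).
\end{aligned}
\]

The identity

\[
\frac{\partial^2 f}{\partial y \partial x}(x, y) = \frac{\partial^2 f}{\partial x \partial y}(x, y)
\]

which is evident in this example is generally valid for twice continuously differentiable functions $f$. This observation is also true for higher derivatives: For $k$-times continuously differentiable functions the order of differentiation of the $k$th partial derivatives is irrelevant (Theorem of Schwarz¹), see [3, Chap. 15, Theorem 1.1].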
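The two mixed derivatives of Example 15.9 can be cross-checked numerically. The sketch below differentiates the closed forms of $\partial f/\partial x$ and $\partial f/\partial y$ once more by central differences and compares both results with $e^{xy^2}(2xy^3 + 2y)$; the helper names, the test point and the step size are assumptions made for illustration.

```python
import math

def fx(x, y):
    """df/dx for f(x, y) = exp(x*y^2), as given in Example 15.9."""
    return math.exp(x * y * y) * y * y

def fy(x, y):
    """df/dy for f(x, y) = exp(x*y^2), as given in Example 15.9."""
    return math.exp(x * y * y) * 2.0 * x * y

def diff_x(g, x, y, h=1e-6):
    """Central difference quotient with respect to the first variable."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def diff_y(g, x, y, h=1e-6):
    """Central difference quotient with respect to the second variable."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

if __name__ == "__main__":
    x0, y0 = 0.5, 1.2
    closed_form = math.exp(x0 * y0 * y0) * (2 * x0 * y0 ** 3 + 2 * y0)
    # d/dy (df/dx), d/dx (df/dy) and the closed form should agree
    # up to the discretisation error.
    print(diff_y(fx, x0, y0), diff_x(fy, x0, y0), closed_form)
```

Such a check does not prove the theorem of Schwarz, of course; it only confirms the computed derivatives at the chosen point.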


1 5.4  The  Frechet  Derivative 


Our  next  topic  is  the  study  of  a  simultaneous  variation  of  both  variables  of  the 
function.  This  leads  us  to  the  notion  of  the  Frechet  derivative.  For  functions  of  one 
variable,  cp  :  M  — >  M,  the  derivative  was  defined  by  the  limit 


lim 


x^a 


<p(x)  -  tp(a) 

x  —  a 


For  functions  of  two  variables  this  expression  does  not  make  sense  anymore  as  one 
cannot  divide  by  vectors.  We  therefore  will  make  use  of  the  equivalent  definition  of 
the  derivative  as  a  linear  approximation 


\[
\varphi(x) = \varphi(a) + A \cdot (x - a) + R(x, a)
\]

with $A = \varphi'(a)$ and the remainder term $R(x, a)$ satisfying

\[
\lim_{x \to a} \frac{R(x, a)}{|x - a|} = 0.
\]


This  formula  can  be  generalised  to  functions  of  two  variables. 


¹ H.A. Schwarz, 1843-1921.

² M. Frechet, 1878-1973.
Definition 15.10 Let $D \subset \mathbb{R}^2$ be open and $f : D \to \mathbb{R}$. The function $f$ is called
Frechet differentiable at the point $(a, b) \in D$, if there exists a linear mapping
$A : \mathbb{R}^2 \to \mathbb{R}$ such that

\[
f(x, y) = f(a, b) + A(x - a, y - b) + R(x, y; a, b)
\]

with a remainder $R(x, y; a, b)$ fulfilling the condition

\[
\lim_{(x,y) \to (a,b)} \frac{R(x, y; a, b)}{\sqrt{(x - a)^2 + (y - b)^2}} = 0.
\]

The linear mapping $A$ is called derivative of $f$ at the point $(a, b)$. Instead of $A$ we
also write $Df(a, b)$. The $(1 \times 2)$-matrix of the linear mapping is called Jacobian³
of $f$. We denote it by $f'(a, b)$.


The  question  whether  the  derivative  of  a  function  is  unique  and  how  it  can  be 
calculated,  is  answered  in  the  following  proposition. 


Proposition 15.11 Let $D \subset \mathbb{R}^2$ be open and $f : D \to \mathbb{R}$. If $f$ is Frechet differentiable
at $(x, y) \in D$, then $f$ is also partially differentiable at $(x, y)$ and

\[
f'(x, y) = \Big[ \frac{\partial f}{\partial x}(x, y), \; \frac{\partial f}{\partial y}(x, y) \Big].
\]

The components of the Jacobian are the partial derivatives. In particular, the Jacobian
and consequently the Frechet derivative are unique.


Proof As an example, we compute the second component and show that

\[
\big(f'(x, y)\big)_2 = \frac{\partial f}{\partial y}(x, y).
\]

Since $f$ is Frechet differentiable at $(x, y)$, it holds that

\[
f(x, y + h) = f(x, y) + f'(x, y) \begin{bmatrix} 0 \\ h \end{bmatrix} + R(x, y + h; x, y).
\]

Therefore

\[
\frac{f(x, y + h) - f(x, y)}{h} - \big(f'(x, y)\big)_2 = \frac{R(x, y + h; x, y)}{h} \to 0 \quad \text{as } h \to 0.
\]

Consequently $f$ is partially differentiable with respect to $y$, and the second component
of the Jacobian is the partial derivative of $f$ with respect to $y$.


³ C.G.J. Jacobi, 1804-1851.
The next proposition follows immediately from the identity

\[
\lim_{(x,y) \to (a,b)} f(x, y) = \lim_{(x,y) \to (a,b)} \big( f(a, b) + Df(a, b)(x - a, y - b) + R(x, y; a, b) \big) = f(a, b).
\]


Proposition  15.12  If  f  is  Frechet  differentiable  then  f  is  continuous.  □ 

In particular, the function

\[
f(x, y) =
\begin{cases}
\dfrac{xy}{x^2 + y^2}, & (x, y) \ne (0, 0), \\[1ex]
0, & (x, y) = (0, 0)
\end{cases}
\]

is not Frechet differentiable at the point $(0, 0)$.
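
The reason is that this function is not continuous at the origin, so Proposition 15.12 rules out Frechet differentiability there. The short Python sketch below is an added illustration, not part of the original text. It evaluates the function along the diagonal $y = x$ and along the $x$-axis; the sample values of `t` are arbitrary.

```python
def f(x, y):
    # the function from the text, with the value 0 prescribed at the origin
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

samples = (1e-1, 1e-3, 1e-6)

# approaching (0, 0) along the diagonal y = x
diagonal_values = [f(t, t) for t in samples]

# approaching (0, 0) along the x-axis
axis_values = [f(t, 0.0) for t in samples]

print(diagonal_values)
print(axis_values)
```

Along the diagonal the values stay at 1/2, along the axis they are 0. The limit at the origin therefore does not exist, although both partial derivatives exist at $(0, 0)$.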

Frechet  differentiability  follows  from  partial  differentiability  under  certain  regu¬ 
larity  assumptions.  In  fact,  one  can  show  that  a  continuously  partially  differentiable 
function  is  Frechet  differentiable,  see  [4,  Chap.  7,  Theorem  7.12]. 


Example 15.13 The function $f : \mathbb{R}^2 \to \mathbb{R} : (x, y) \mapsto x^2 e^{3y}$ is Frechet differentiable,
its derivative is

\[
f'(x, y) = [2x e^{3y}, \; 3x^2 e^{3y}] = x e^{3y} [2, \; 3x].
\]

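Example 15.14 The affine function $f : \mathbb{R}^2 \to \mathbb{R}$ with

\[
f(x, y) = \alpha x + \beta y + \gamma = [\alpha, \beta] \begin{bmatrix} x \\ y \end{bmatrix} + \gamma
\]

is Frechet differentiable and $f'(x, y) = [\alpha, \beta]$.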


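Example 15.15 The quadratic function $f : \mathbb{R}^2 \to \mathbb{R}$ with

\[
f(x, y) = \alpha x^2 + 2\beta xy + \gamma y^2 + \delta x + \varepsilon y + \zeta
= [x, y] \begin{bmatrix} \alpha & \beta \\ \beta & \gamma \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
+ [\delta, \varepsilon] \begin{bmatrix} x \\ y \end{bmatrix} + \zeta
\]

is Frechet differentiable with the Jacobian

\[
f'(x, y) = [2\alpha x + 2\beta y + \delta, \; 2\beta x + 2\gamma y + \varepsilon]
= 2 [x, y] \begin{bmatrix} \alpha & \beta \\ \beta & \gamma \end{bmatrix} + [\delta, \varepsilon].
\]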


The  chain  rule.  Now  we  are  in  the  position  to  generalise  the  chain  rule  to  the  case 
of  two  variables. 


Proposition 15.16 Let $D \subset \mathbb{R}^2$ be open and $f : D \to \mathbb{R} : (x, y) \mapsto f(x, y)$ Frechet
differentiable. Furthermore let $I \subset \mathbb{R}$ be an open interval and $\varphi, \psi : I \to \mathbb{R}$
differentiable. Then the composition of functions

\[
F : I \to \mathbb{R} : t \mapsto F(t) = f(\varphi(t), \psi(t))
\]

is also differentiable and

\[
\frac{dF}{dt}(t) = \frac{\partial f}{\partial x}(\varphi(t), \psi(t)) \cdot \frac{d\varphi}{dt}(t)
+ \frac{\partial f}{\partial y}(\varphi(t), \psi(t)) \cdot \frac{d\psi}{dt}(t).
\]

Proof From the Frechet differentiability of $f$ it follows that

\[
\begin{aligned}
F(t + h) - F(t) &= f(\varphi(t + h), \psi(t + h)) - f(\varphi(t), \psi(t)) \\
&= f'(\varphi(t), \psi(t)) \begin{bmatrix} \varphi(t + h) - \varphi(t) \\ \psi(t + h) - \psi(t) \end{bmatrix}
+ R(\varphi(t + h), \psi(t + h); \varphi(t), \psi(t)).
\end{aligned}
\]

We divide this expression by $h$ and subsequently examine the limit as $h \to 0$. Let
$g(t, h) = (\varphi(t + h) - \varphi(t))^2 + (\psi(t + h) - \psi(t))^2$. Then, due to the differentiability
of $f$, $\varphi$ and $\psi$, we have

\[
\lim_{h \to 0} \frac{R(\varphi(t + h), \psi(t + h); \varphi(t), \psi(t))}{\sqrt{g(t, h)}} \cdot \frac{\sqrt{g(t, h)}}{h} = 0.
\]

Therefore, the function $F$ is differentiable and the formula stated in the proposition
is valid. □


Example 15.17 Let $D \subset \mathbb{R}^2$ be an open set that contains the circle $x^2 + y^2 = 1$ and
let $f : D \to \mathbb{R}$ be a differentiable function. Then the restriction $F$ of $f$ to the circle

\[
F : \mathbb{R} \to \mathbb{R} : t \mapsto f(\cos t, \sin t)
\]

is differentiable as a function of the angle $t$ and

\[
\frac{dF}{dt}(t) = -\frac{\partial f}{\partial x}(\cos t, \sin t) \cdot \sin t + \frac{\partial f}{\partial y}(\cos t, \sin t) \cdot \cos t.
\]

For instance, for $f(x, y) = x^2 - y^2$ the derivative is $\frac{dF}{dt}(t) = -4 \cos t \sin t$.
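
The chain rule of Example 15.17 can be checked numerically. The following Python sketch is an added illustration under stated assumptions: the helper names are hypothetical, and the step size `h` and the angle `t0` are arbitrary. It evaluates the chain rule formula for $f(x, y) = x^2 - y^2$ and compares it with a central difference quotient of $F(t) = f(\cos t, \sin t)$ and with the closed form $-4 \cos t \sin t$.

```python
import math

def f(x, y):
    return x * x - y * y

def grad_f(x, y):
    # partial derivatives of f with respect to x and y
    return (2.0 * x, -2.0 * y)

def F(t):
    # restriction of f to the unit circle
    return f(math.cos(t), math.sin(t))

def dF_chain_rule(t):
    # dF/dt = -f_x(cos t, sin t) * sin t + f_y(cos t, sin t) * cos t
    fx, fy = grad_f(math.cos(t), math.sin(t))
    return -fx * math.sin(t) + fy * math.cos(t)

def dF_numeric(t, h=1e-6):
    # central difference quotient of F
    return (F(t + h) - F(t - h)) / (2.0 * h)

t0 = 0.7
print(dF_chain_rule(t0), dF_numeric(t0), -4.0 * math.cos(t0) * math.sin(t0))
```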

Interpretation of the Frechet derivative. Using the Frechet derivative we obtain,
as in the case of one variable, the linear approximation $g(x, y)$ to the graph of the
function at $(a, b)$

\[
g(x, y) = f(a, b) + f'(a, b) \begin{bmatrix} x - a \\ y - b \end{bmatrix} \approx f(x, y).
\]
Now we want to interpret the plane

\[
z = f(a, b) + f'(a, b) \begin{bmatrix} x - a \\ y - b \end{bmatrix}
\]

geometrically. For this we use the fact that the components of the Jacobian are the
partial derivatives. With that we can write the above equation as

\[
z = f(a, b) + \frac{\partial f}{\partial x}(a, b) \cdot (x - a) + \frac{\partial f}{\partial y}(a, b) \cdot (y - b),
\]

or alternatively in parametric form ($x - a = \lambda$, $y - b = \mu$)

\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} a \\ b \\ f(a, b) \end{bmatrix}
+ \lambda \begin{bmatrix} 1 \\ 0 \\ \frac{\partial f}{\partial x}(a, b) \end{bmatrix}
+ \mu \begin{bmatrix} 0 \\ 1 \\ \frac{\partial f}{\partial y}(a, b) \end{bmatrix}.
\]

The plane intersects the graph of $f$ at the point $(a, b, f(a, b))$ and is spanned by the
tangent vectors to the coordinate curves. The equation

\[
z = f(a, b) + \frac{\partial f}{\partial x}(a, b) \cdot (x - a) + \frac{\partial f}{\partial y}(a, b) \cdot (y - b)
\]

consequently describes the tangent plane to the graph of $f$ at the point $(a, b)$.

The  example  shows  that  the  graph  of  a  function  which  is  Frechet  differentiable 
at  the  point  (x,  y)  possesses  a  tangent  plane  at  this  point.  Note  that  the  existence  of 
tangents  to  the  coordinate  curves  does  not  imply  the  existence  of  a  tangent  plane, 
see  Remark  15.6. 

Example 15.18 We calculate the tangent plane at a point on the northern hemisphere
(with radius $r$)

\[
f(x, y) = z = \sqrt{r^2 - x^2 - y^2}.
\]

Let $c = f(a, b) = \sqrt{r^2 - a^2 - b^2}$. The partial derivatives of $f$ at $(a, b)$ are

\[
\frac{\partial f}{\partial x}(a, b) = -\frac{a}{\sqrt{r^2 - a^2 - b^2}} = -\frac{a}{c}, \qquad
\frac{\partial f}{\partial y}(a, b) = -\frac{b}{\sqrt{r^2 - a^2 - b^2}} = -\frac{b}{c}.
\]

Therefore, the equation of the tangent plane is

\[
z = c - \frac{a}{c}(x - a) - \frac{b}{c}(y - b),
\]

or alternatively

\[
a(x - a) + b(y - b) + c(z - c) = 0.
\]


The  last  formula  actually  holds  for  all  points  on  the  surface  of  the  sphere. 


15.5 Directional Derivative and Gradient


So far functions $f : D \subset \mathbb{R}^2 \to \mathbb{R}$ were defined on $\mathbb{R}^2$ as a point space. For the
purpose of directional derivatives it is useful and customary to write the arguments
$(x, y) \in \mathbb{R}^2$ as position vectors $\mathbf{x} = [x, y]^T$. In this way each function
$f : D \subset \mathbb{R}^2 \to \mathbb{R}$ can also be considered as a function of column vectors. We identify these
two functions and will not distinguish between $f(x, y)$ and $f(\mathbf{x})$ henceforth.

In  Sect.  15.3  we  have  defined  partial  derivatives  along  coordinate  axes.  Now  we 
want  to  generalise  this  concept  to  differentiation  in  any  direction. 


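Definition 15.19 Let $D \subset \mathbb{R}^2$ be open, $\mathbf{x} = [x, y]^T \in D$ and $f : D \to \mathbb{R}$. Furthermore
let $\mathbf{v} \in \mathbb{R}^2$ with $\|\mathbf{v}\| = 1$. The limit

\[
\partial_{\mathbf{v}} f(\mathbf{x}) = \frac{\partial f}{\partial \mathbf{v}}(\mathbf{x})
= \lim_{h \to 0} \frac{f(\mathbf{x} + h\mathbf{v}) - f(\mathbf{x})}{h}
= \lim_{h \to 0} \frac{f(x + hv_1, y + hv_2) - f(x, y)}{h}
\]

(in case it exists) is called directional derivative of $f$ at $\mathbf{x}$ in direction $\mathbf{v}$.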


The  partial  derivatives  are  special  cases  of  the  directional  derivative,  namely  the 
derivatives  in  direction  of  the  coordinate  axes. 

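The directional derivative $\partial_{\mathbf{v}} f(\mathbf{x})$ describes the rate of change of the function $f$
at the point $\mathbf{x}$ in the direction of $\mathbf{v}$. Indeed, this can be seen from the following.
Consider the straight line $\{\mathbf{x} + t\mathbf{v} \mid t \in \mathbb{R}\} \subset \mathbb{R}^2$ and the function

\[
g(t) = f(\mathbf{x} + t\mathbf{v}) \qquad \text{($f$ restricted to this straight line)}
\]

with $g(0) = f(\mathbf{x})$. Then

\[
g'(0) = \lim_{h \to 0} \frac{g(h) - g(0)}{h}
= \lim_{h \to 0} \frac{f(\mathbf{x} + h\mathbf{v}) - f(\mathbf{x})}{h}
= \partial_{\mathbf{v}} f(\mathbf{x}).
\]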


Next  we  clarify  how  the  directional  derivative  can  be  computed.  For  that  we  need 
the  following  definition. 


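Definition 15.20 Let $D \subset \mathbb{R}^2$ be open and $f : D \to \mathbb{R}$ partially differentiable. The
vector

\[
\nabla f(x, y) = \begin{bmatrix} \frac{\partial f}{\partial x}(x, y) \\[0.5ex] \frac{\partial f}{\partial y}(x, y) \end{bmatrix} = f'(x, y)^T
\]

is called gradient of $f$.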
Fig. 15.6 Geometric interpretation of $\nabla f$


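Proposition 15.21 Let $D \subset \mathbb{R}^2$ be open, $\mathbf{v} = [v_1, v_2]^T \in \mathbb{R}^2$, $\|\mathbf{v}\| = 1$ and
$f : D \to \mathbb{R}$ Frechet differentiable at $\mathbf{x} = [x, y]^T$. Then

\[
\partial_{\mathbf{v}} f(\mathbf{x}) = \langle \nabla f(\mathbf{x}), \mathbf{v} \rangle = f'(x, y)\, \mathbf{v}
= \frac{\partial f}{\partial x}(x, y)\, v_1 + \frac{\partial f}{\partial y}(x, y)\, v_2.
\]

Proof Since $f$ is Frechet differentiable at $\mathbf{x}$, it holds that

\[
f(\mathbf{x} + h\mathbf{v}) = f(\mathbf{x}) + f'(\mathbf{x}) \cdot h\mathbf{v} + R(x + hv_1, y + hv_2; x, y)
\]

and hence

\[
\frac{f(\mathbf{x} + h\mathbf{v}) - f(\mathbf{x})}{h} = f'(\mathbf{x}) \cdot \mathbf{v} + \frac{R(x + hv_1, y + hv_2; x, y)}{h}.
\]

Letting $h \to 0$ proves the desired assertion. □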
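
Proposition 15.21 can be illustrated numerically. The Python sketch below is an added example, not one of the book's programs. It uses the function $f(x, y) = x^2 e^{3y}$ from Example 15.13 and compares the difference quotient from Definition 15.19 with the scalar product of gradient and direction. The point, the unit vector `v` and the step size `h` are arbitrary assumptions.

```python
import math

def f(x, y):
    return x * x * math.exp(3.0 * y)

def grad_f(x, y):
    # gradient of f, taken from the Jacobian in Example 15.13
    e = math.exp(3.0 * y)
    return (2.0 * x * e, 3.0 * x * x * e)

def directional_by_gradient(x, y, v):
    # <grad f(x, y), v>
    gx, gy = grad_f(x, y)
    return gx * v[0] + gy * v[1]

def directional_by_quotient(x, y, v, h=1e-6):
    # difference quotient (f(x + h v) - f(x)) / h for a small h
    return (f(x + h * v[0], y + h * v[1]) - f(x, y)) / h

point = (1.0, 0.5)
v = (0.6, 0.8)  # unit vector, since 0.36 + 0.64 = 1

print(directional_by_gradient(*point, v), directional_by_quotient(*point, v))
```

The two values agree up to an error of order `h`, as the remainder term in the proof suggests.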

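Proposition 15.22 (Geometric interpretation of $\nabla f$) Let $D \subset \mathbb{R}^2$ be open and
$f : D \to \mathbb{R}$ continuously differentiable at $\mathbf{x} = (x, y)$ with $f'(\mathbf{x}) \ne [0, 0]$. Then $\nabla f(\mathbf{x})$
is perpendicular to the level curve $N_{f(\mathbf{x})} = \{\tilde{\mathbf{x}} \in \mathbb{R}^2 \; ; \; f(\tilde{\mathbf{x}}) = f(\mathbf{x})\}$ and points in
direction of the steepest ascent of $f$, see Fig. 15.6.

Proof Let $\mathbf{v}$ be a tangent vector to the level curve at the point $\mathbf{x}$. From the implicit
function theorem (see [4, Chap. 14.1]) it follows that $N_{f(\mathbf{x})}$ can be parametrised as
a differentiable curve $\gamma(t) = [x(t), y(t)]^T$, with

\[
\gamma(0) = \mathbf{x} \quad \text{and} \quad \dot\gamma(0) = \mathbf{v}
\]

in a neighbourhood of $\mathbf{x}$. Thus, for all $t$ near $t = 0$,

\[
f(\gamma(t)) = f(\mathbf{x}) = \text{const}.
\]

Since $f$ and $\gamma$ are differentiable, it follows from the chain rule (Proposition 15.16)
that

\[
0 = \frac{d}{dt} f(\gamma(t)) \Big|_{t=0} = f'(\gamma(0))\, \dot\gamma(0) = \langle \nabla f(\mathbf{x}), \mathbf{v} \rangle
\]

because $\gamma(0) = \mathbf{x}$ and $\dot\gamma(0) = \mathbf{v}$. Hence $\nabla f(\mathbf{x})$ is perpendicular to $\mathbf{v}$. Let $\mathbf{w} \in \mathbb{R}^2$ be
a further unit vector. Then

\[
\partial_{\mathbf{w}} f(\mathbf{x}) = \frac{\partial f}{\partial \mathbf{w}}(\mathbf{x}) = \langle \nabla f(\mathbf{x}), \mathbf{w} \rangle = \|\nabla f(\mathbf{x})\| \cdot \|\mathbf{w}\| \cdot \cos \angle,
\]

where $\angle$ denotes the angle enclosed by $\nabla f(\mathbf{x})$ and $\mathbf{w}$. From this formula one deduces
that $\partial_{\mathbf{w}} f(\mathbf{x})$ is maximal if and only if $\cos \angle = 1$, which means $\nabla f(\mathbf{x}) = \lambda \mathbf{w}$ for some
$\lambda > 0$. □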

Example 15.23 Let $f(x, y) = x^2 + y^2$. Then $\nabla f(x, y) = 2[x, y]^T$.
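
The steepest-ascent property of the gradient can be observed by brute force. The following Python sketch is an added illustration, not from the original text. It takes $f(x, y) = x^2 + y^2$, scans unit vectors $[\cos\theta, \sin\theta]^T$ on a grid of angles and selects the one with the largest directional derivative, computed with the formula $\langle \nabla f, \mathbf{w} \rangle$ for Frechet differentiable functions. The point $(1, 2)$ and the grid of 3600 angles are arbitrary choices.

```python
import math

def grad_f(x, y):
    # gradient of f(x, y) = x^2 + y^2
    return (2.0 * x, 2.0 * y)

def directional_derivative(x, y, angle):
    # <grad f(x, y), w> for the unit vector w = [cos(angle), sin(angle)]
    gx, gy = grad_f(x, y)
    return gx * math.cos(angle) + gy * math.sin(angle)

x0, y0 = 1.0, 2.0
angles = [2.0 * math.pi * k / 3600 for k in range(3600)]

# direction on the grid with the largest directional derivative
best_angle = max(angles, key=lambda a: directional_derivative(x0, y0, a))
best_direction = (math.cos(best_angle), math.sin(best_angle))

# normalised gradient for comparison
gx, gy = grad_f(x0, y0)
grad_norm = math.hypot(gx, gy)
gradient_direction = (gx / grad_norm, gy / grad_norm)

print(best_direction, gradient_direction)
```

Up to the resolution of the angle grid, the best direction coincides with the normalised gradient, and the maximal directional derivative equals $\|\nabla f(x_0, y_0)\|$.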


15.6 The Taylor Formula in Two Variables

Let $f : D \subset \mathbb{R}^2 \to \mathbb{R}$ be a function of two variables. In the following calculation we
assume that $f$ is at least three times continuously differentiable. In order to expand
$f(x + h, y + k)$ into a Taylor series in a neighbourhood of $(x, y)$, we first fix the
second variable and expand with respect to the first:

\[
f(x + h, y + k) = f(x, y + k) + \frac{\partial f}{\partial x}(x, y + k) \cdot h + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}(x, y + k) \cdot h^2 + O(h^3).
\]

Then  we  also  expand  the  terms  on  the  right-hand  side  with  respect  to  the  second 
variable  (while  fixing  the  first  one): 


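\[
\begin{aligned}
f(x, y + k) &= f(x, y) + \frac{\partial f}{\partial y}(x, y) \cdot k + \frac{1}{2} \frac{\partial^2 f}{\partial y^2}(x, y) \cdot k^2 + O(k^3), \\
\frac{\partial f}{\partial x}(x, y + k) &= \frac{\partial f}{\partial x}(x, y) + \frac{\partial^2 f}{\partial y\,\partial x}(x, y) \cdot k + O(k^2), \\
\frac{\partial^2 f}{\partial x^2}(x, y + k) &= \frac{\partial^2 f}{\partial x^2}(x, y) + O(k).
\end{aligned}
\]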


Inserting  these  expressions  into  the  equation  above,  we  obtain 

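\[
\begin{aligned}
f(x + h, y + k) = {} & f(x, y) + \frac{\partial f}{\partial x}(x, y) \cdot h + \frac{\partial f}{\partial y}(x, y) \cdot k \\
& + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}(x, y) \cdot h^2 + \frac{1}{2} \frac{\partial^2 f}{\partial y^2}(x, y) \cdot k^2 + \frac{\partial^2 f}{\partial y\,\partial x}(x, y) \cdot hk \\
& + O(h^3) + O(h^2 k) + O(h k^2) + O(k^3).
\end{aligned}
\]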


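In matrix-vector notation we can also write this equation as

\[
f(x + h, y + k) = f(x, y) + f'(x, y) \begin{bmatrix} h \\ k \end{bmatrix}
+ \frac{1}{2} [h, k] \cdot H_f(x, y) \begin{bmatrix} h \\ k \end{bmatrix} + \ldots
\]

with the Hessian matrix⁴

\[
H_f(x, y) =
\begin{bmatrix}
\frac{\partial^2 f}{\partial x^2}(x, y) & \frac{\partial^2 f}{\partial y\,\partial x}(x, y) \\[1ex]
\frac{\partial^2 f}{\partial x\,\partial y}(x, y) & \frac{\partial^2 f}{\partial y^2}(x, y)
\end{bmatrix}
\]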
collecting  the  second-order  partial  derivatives.  By  the  above  assumptions,  these 
derivatives  are  continuous.  Thus  the  Hessian  matrix  is  symmetric  due  to  Schwarz’s 
theorem. 


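Example 15.24 We compute the second-order approximation to the function
$f : \mathbb{R}^2 \to \mathbb{R} : (x, y) \mapsto x^2 \sin y$ at the point $(a, b) = (2, 0)$. The partial derivatives are

\[
\begin{array}{l|cccccc}
 & f & \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial y\,\partial x} & \frac{\partial^2 f}{\partial y^2} \\ \hline
\text{General} & x^2 \sin y & 2x \sin y & x^2 \cos y & 2 \sin y & 2x \cos y & -x^2 \sin y \\
\text{At } (2, 0) & 0 & 0 & 4 & 0 & 4 & 0
\end{array}
\]

Therefore, the quadratic approximation $g(x, y) \approx f(x, y)$ is given by the formula

\[
\begin{aligned}
g(x, y) &= f(2, 0) + f'(2, 0) \begin{bmatrix} x - 2 \\ y \end{bmatrix}
+ \frac{1}{2} [x - 2, y] \cdot H_f(2, 0) \begin{bmatrix} x - 2 \\ y \end{bmatrix} \\
&= 0 + [0, 4] \begin{bmatrix} x - 2 \\ y \end{bmatrix}
+ \frac{1}{2} [x - 2, y] \begin{bmatrix} 0 & 4 \\ 4 & 0 \end{bmatrix} \begin{bmatrix} x - 2 \\ y \end{bmatrix} \\
&= 4y + 4y(x - 2) = 4y(x - 1).
\end{aligned}
\]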
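
The quality of this quadratic approximation can be examined numerically. The Python sketch below is an added illustration, not part of the original text. It compares $f(x, y) = x^2 \sin y$ with $g(x, y) = 4y(x - 1)$ at two points near $(2, 0)$. The chosen offsets are arbitrary assumptions.

```python
import math

def f(x, y):
    return x * x * math.sin(y)

def g(x, y):
    # second-order Taylor approximation of f at (2, 0) from Example 15.24
    return 4.0 * y * (x - 1.0)

def approximation_error(dx, dy):
    # |f - g| at the point (2 + dx, 0 + dy)
    return abs(f(2.0 + dx, dy) - g(2.0 + dx, dy))

error_coarse = approximation_error(0.1, 0.1)
error_fine = approximation_error(0.01, 0.01)

print(error_coarse, error_fine, error_coarse / error_fine)
```

Reducing the offset by a factor of 10 reduces the error by roughly a factor of 1000, which is consistent with the third-order remainder terms in the Taylor formula.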


15.7 Local Maxima and Minima

Let $D \subset \mathbb{R}^2$ be open and $f : D \to \mathbb{R}$. In this section we investigate the graph of the
function $f$ with respect to maxima and minima.

Definition 15.25 The scalar function $f$ has a local maximum (respectively, local
minimum) at $(a, b) \in D$, if

\[
f(x, y) \le f(a, b) \qquad (\text{respectively, } f(x, y) \ge f(a, b))
\]

for all $(x, y)$ in a neighbourhood of $(a, b)$. The maximum (minimum) is called
isolated, if $(a, b)$ is the only point in a neighbourhood with this property.

⁴ L.O. Hesse, 1811-1874.

Fig. 15.7 Local and isolated local minima. The function in the picture on the left has local minima
along the straight line y = 2. The minima are not isolated. The function in the middle picture has an
isolated minimum at (x, y) = (0, 0). This minimum is even a global minimum. Finally, the function
in the picture on the right-hand side also has an isolated minimum at (x, y) = (0, 0). However, the
function is not differentiable at that point

Figure  15.7  shows  a  few  typical  examples.  One  observes  that  the  existence  of  a 
horizontal  tangent  plane  is  a  necessary  condition  for  extrema  (i.e.  maxima  or  minima) 
of  differentiable  functions. 


Proposition 15.26 Let $f$ be partially differentiable. If $f$ has a local maximum or
minimum at $(a, b) \in D$, then the partial derivatives vanish at $(a, b)$:

\[
\frac{\partial f}{\partial x}(a, b) = \frac{\partial f}{\partial y}(a, b) = 0.
\]

If in addition, $f$ is Frechet differentiable, then $f'(a, b) = [0, 0]$, i.e. $f$ has a horizontal
tangent plane at $(a, b)$.

Proof Due to the assumptions, the function $g(h) = f(a + h, b)$ has an extremum at
$h = 0$. Thus, Proposition 8.2 implies

\[
g'(0) = \frac{\partial f}{\partial x}(a, b) = 0.
\]

Likewise one can show that

\[
\frac{\partial f}{\partial y}(a, b) = 0.
\]

□


Definition 15.27 Let $f$ be a Frechet differentiable function with $f'(a, b) = [0, 0]$.
Then $(a, b)$ is called a stationary point of $f$.


Stationary points are consequently candidates for extrema. Conversely, not all
stationary points are extrema; they can also be saddle points. We call $(a, b)$ a saddle
point of $f$, if there is a vertical cut through the graph which has a local maximum at
$(a, b)$, and a second vertical cut which has a local minimum at $(a, b)$, see, for example,
Fig. 15.2. In order to decide which case occurs, one resorts to Taylor expansion, similarly
as for functions of one variable.

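Let $\mathbf{a} = [a, b]^T$ be a stationary point of $f$ and $\mathbf{v} \in \mathbb{R}^2$ any unit vector. We investigate
the behaviour of $f$, restricted to the straight line $\mathbf{a} + \lambda \mathbf{v}$, $\lambda \in \mathbb{R}$. Taylor expansion
shows that

\[
f(\mathbf{a} + \lambda \mathbf{v}) = f(\mathbf{a}) + f'(\mathbf{a}) \cdot \lambda \mathbf{v} + \frac{1}{2} \lambda^2\, \mathbf{v}^T H_f(\mathbf{a})\, \mathbf{v} + O(\lambda^3).
\]

Since $\mathbf{a}$ is a stationary point, it follows that $f'(\mathbf{a}) = [0, 0]$ and consequently

\[
\frac{f(\mathbf{a} + \lambda \mathbf{v}) - f(\mathbf{a})}{\lambda^2} = \frac{1}{2}\, \mathbf{v}^T H_f(\mathbf{a})\, \mathbf{v} + O(\lambda).
\]

For small $\lambda$ the sign on the left-hand side is therefore determined by the sign of
$\mathbf{v}^T H_f(\mathbf{a})\, \mathbf{v}$. We ask how this can be expressed by conditions on $H_f(\mathbf{a})$. Writing

\[
H_f(\mathbf{a}) = \begin{bmatrix} \alpha & \beta \\ \beta & \gamma \end{bmatrix}
\quad \text{and} \quad
\mathbf{v} = \begin{bmatrix} v \\ w \end{bmatrix}
\]

we get

\[
\mathbf{v}^T H_f(\mathbf{a})\, \mathbf{v} = \alpha v^2 + 2\beta vw + \gamma w^2.
\]

For an isolated local minimum this expression has to be positive for all $\mathbf{v} \ne \mathbf{0}$. If
$w = 0$ and $v \ne 0$, then $\alpha v^2 > 0$ and therefore necessarily

\[
\alpha > 0.
\]

If $w \ne 0$, we substitute $v = tw$ with $t \in \mathbb{R}$ and obtain

\[
\alpha t^2 w^2 + 2\beta t w^2 + \gamma w^2 > 0,
\]

or alternatively (multiplying by $\alpha > 0$ and simplifying by $w^2$)

\[
t^2 \alpha^2 + 2t\alpha\beta + \alpha\gamma > 0.
\]

Therefore,

\[
(t\alpha + \beta)^2 + \alpha\gamma - \beta^2 > 0
\]

for all $t \in \mathbb{R}$. The left-hand side is smallest for $t = -\beta/\alpha$. Inserting this we obtain
the second condition

\[
\det H_f(\mathbf{a}) = \alpha\gamma - \beta^2 > 0
\]

in terms of the determinant, see Appendix B.1.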

We  have  thus  shown  the  following  result. 
Proposition 15.28 The function $f$ has an isolated local minimum at the stationary
point $\mathbf{a}$, if the conditions

\[
\frac{\partial^2 f}{\partial x^2}(\mathbf{a}) > 0 \quad \text{and} \quad \det H_f(\mathbf{a}) > 0
\]

are fulfilled.

By replacing $f$ by $-f$ one gets the corresponding result for isolated maxima.

Proposition 15.29 The function $f$ has an isolated local maximum at the stationary
point $\mathbf{a}$, if the conditions

\[
\frac{\partial^2 f}{\partial x^2}(\mathbf{a}) < 0 \quad \text{and} \quad \det H_f(\mathbf{a}) > 0
\]

are fulfilled.

In a similar way one can prove the following assertion.

Proposition 15.30 The stationary point $\mathbf{a}$ of the function $f$ is a saddle point, if
$\det H_f(\mathbf{a}) < 0$.

If  the  determinant  of  the  Hessian  matrix  equals  zero,  the  behaviour  of  the  function 
needs  to  be  investigated  along  vertical  cuts.  One  example  is  given  in  Exercise  12. 

Example 15.31 We determine the maxima, minima and saddle points of the function
$f(x, y) = x^6 + y^6 - 3x^2 - 3y^2$. The condition

\[
f'(x, y) = [6x^5 - 6x, \; 6y^5 - 6y] = [0, 0]
\]

gives the following nine stationary points

\[
x_1 = 0, \quad x_{2,3} = \pm 1, \qquad y_1 = 0, \quad y_{2,3} = \pm 1.
\]

The Hessian matrix of the function is

\[
H_f(x, y) = \begin{bmatrix} 30x^4 - 6 & 0 \\ 0 & 30y^4 - 6 \end{bmatrix}.
\]

Applying the criteria of Propositions 15.28 through 15.30, we obtain the following
results: The point $(0, 0)$ is an isolated local maximum of $f$, the points $(-1, -1)$,
$(-1, 1)$, $(1, -1)$ and $(1, 1)$ are isolated local minima, and the points $(-1, 0)$, $(1, 0)$,
$(0, -1)$ and $(0, 1)$ are saddle points. The reader is advised to visualise this function
with maple.
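
The classification in Example 15.31 can be reproduced mechanically. The following Python sketch is an added illustration, not one of the book's programs. It applies the criteria of Propositions 15.28 through 15.30 to the nine stationary points. The function names and the label "undecided" for a vanishing determinant are assumptions made for this sketch.

```python
def hessian(x, y):
    # Hessian matrix of f(x, y) = x^6 + y^6 - 3x^2 - 3y^2
    return ((30 * x**4 - 6, 0), (0, 30 * y**4 - 6))

def classify(x, y):
    (alpha, beta), (_, gamma) = hessian(x, y)
    det = alpha * gamma - beta * beta
    if det < 0:
        return "saddle point"            # Proposition 15.30
    if det > 0 and alpha > 0:
        return "isolated local minimum"  # Proposition 15.28
    if det > 0 and alpha < 0:
        return "isolated local maximum"  # Proposition 15.29
    return "undecided"                   # det = 0: criteria do not apply

# the nine stationary points: x and y each in {-1, 0, 1}
stationary_points = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
results = {p: classify(*p) for p in stationary_points}

for p, kind in results.items():
    print(p, kind)
```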
15.8 Exercises


1.  Compute  the  partial  derivatives  of  the  functions 


f(x,y )  =  arcsin 


g(x,  y )  =  log 


1 

^Jx2  +  y2 


Verify  your  results  with  maple . 

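2. Show that the function

\[
v(x, t) = \frac{1}{\sqrt{t}} \exp\Big( -\frac{x^2}{4t} \Big)
\]

satisfies the heat equation

\[
\frac{\partial v}{\partial t} = \frac{\partial^2 v}{\partial x^2}
\]

for $t > 0$ and $x \in \mathbb{R}$.

3. Show that the function $w(x, t) = g(x - kt)$ satisfies the transport equation

\[
\frac{\partial w}{\partial t} + k \frac{\partial w}{\partial x} = 0
\]

for any differentiable function $g$.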

4. Show that the function $g(x, y) = \log(x^2 + 2y^2)$ satisfies the equation

\[
\frac{\partial^2 g}{\partial x^2} + \frac{1}{2} \frac{\partial^2 g}{\partial y^2} = 0
\]

for $(x, y) \ne (0, 0)$.

5. Represent the ellipsoid $x^2 + 2y^2 + z^2 = 1$ as graph of a function $(x, y) \mapsto f(x, y)$.
Distinguish between positive and negative $z$-coordinates, respectively.
Compute the partial derivatives of $f$ and sketch the level curves of $f$. Find the
direction in which $\nabla f$ points.

6. Solve Exercise 5 for the hyperboloid $x^2 + 2y^2 - z^2 = 1$.

7. Compute the directional derivative of the function $f(x, y) = xy$ in the direction
$\mathbf{v}$ at the four points $\mathbf{a}_1, \ldots, \mathbf{a}_4$, where

ai  =  (1,  2),  a2  =  (-1,  2),  a3  =  (1,  -2),  a4  =  (-1,  -2)  and  v  = 

At  the  given  points  ai,  . . .  a4,  determine  the  direction  for  which  the  directional 
derivative  is  maximal. 
8. Consider the function $f(x, y) = 4 - x^2 - y^2$.

(a) Determine and sketch the level curves $f(x, y) = c$ for $c = 4, 2, 0, -2$ and
the graphs of the coordinate curves

\[
x \mapsto \begin{bmatrix} x \\ b \\ f(x, b) \end{bmatrix}, \qquad
y \mapsto \begin{bmatrix} a \\ y \\ f(a, y) \end{bmatrix}
\]

for $a, b = -1, 0, 1$.

(b) Compute the gradient of $f$ at the point $(1, 1)$ and determine the equation of
the tangent plane at $(1, 1, 2)$. Verify that the gradient is perpendicular to the
level curve through $(1, 1, 2)$.

(c) Compute the directional derivatives of $f$ at $(1, 1)$ in the directions


1 

T 

1 

"-1" 

1 

1 

'  1  ‘ 

Vi  =  -= 

V2 

_1_ 

[CN 

II 

CN 

_  1  _ 

’V3=V2 

,  V4  =  /- 

V2 

Sketch the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_4$ in the $(x, y)$-plane and interpret the value of
the directional derivatives.

9. Consider the function $f(x, y) = y e^{2x - y}$, where $x = x(t)$ and $y = y(t)$ are
differentiable functions satisfying

\[
x(0) = 2, \quad y(0) = 4, \quad \dot{x}(0) = -1, \quad \dot{y}(0) = 4.
\]

From this information compute the derivative of $z(t) = f(x(t), y(t))$ at the point
$t = 0$.

10. Find all stationary points of the function

\[
f(x, y) = x^3 - 3xy^2 + 6y.
\]

Determine whether they are maxima, minima or saddle points.

11. Find the stationary point of the function

\[
f(x, y) = e^x + y e^y - x
\]

and determine whether it is a maximum, minimum or a saddle point.

12. Investigate the function

\[
f(x, y) = x^4 - 3x^2 y + y^3
\]

for local extrema and saddle points. Visualise the graph of the function.

Hint. To study the behaviour of the function at $(0, 0)$ consider the partial mappings
$f(x, 0)$ and $f(0, y)$.

13.  Determine  for  the  function 


fix,y)=x2^3iy-3)-^y2 
(a) the gradient and the Hessian matrix;

(b)  the  second-order  Taylor  approximation  at  (0,  0); 

(c)  all  stationary  points.  Find  out  whether  they  are  maxima,  minima  or  saddle 
points. 

14. Expand the polynomial $f(x, y) = x^2 + xy + 3y^2$ in powers of $x - 1$ and $y - 2$,
i.e. in the form

\[
f(x, y) = \alpha (x - 1)^2 + \beta (x - 1)(y - 2) + \gamma (y - 2)^2 + \delta (x - 1) + \varepsilon (y - 2) + \zeta.
\]

Hint. Use the second-order Taylor expansion at $(1, 2)$.

15. Compute $(0.95)^{2.01}$ numerically by using the second-order Taylor approximation
to the function $f(x, y) = x^y$ at $(1, 2)$.


16 Vector-Valued Functions of Two Variables


In this section we briefly touch upon the theory of vector-valued functions in several
variables.  To  simplify  matters  we  limit  ourselves  again  to  the  case  of  two  variables. 

First  we  define  vector  fields  in  the  plane  and  extend  the  notions  of  continuity 
and  differentiability  to  vector-valued  functions.  Then  we  discuss  Newton’s  method 
in  two  variables.  As  an  application  we  compute  a  common  zero  of  two  nonlinear 
functions.  Finally,  as  an  extension  of  Sect.  15.1,  we  show  how  smooth  surfaces  can 
be  described  mathematically  with  the  help  of  parameterisations. 

For  the  required  basic  notions  of  vector  and  matrix  algebra  we  refer  to  the  Appen¬ 
dices  A  and  B. 


16.1 Vector Fields and the Jacobian

In the entire section $D$ denotes an open subset of $\mathbb{R}^2$ and

\[
\mathbf{F} : D \subset \mathbb{R}^2 \to \mathbb{R}^2 : (x, y) \mapsto \begin{bmatrix} u \\ v \end{bmatrix} = \mathbf{F}(x, y) = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}
\]

a vector-valued function of two variables with values in $\mathbb{R}^2$. Such functions are also
called vector fields since they assign a vector to every point in the plane. Important
applications are provided in physics. For example, the velocity field of a flowing
liquid or the gravitational field are mathematically described as vector fields.

In the previous chapter we have already encountered a vector field, namely the
gradient of a scalar-valued function of two variables $f : D \to \mathbb{R} : (x, y) \mapsto f(x, y)$.
For a partially differentiable function $f$ the gradient

\[
\mathbf{F} = \nabla f : D \to \mathbb{R}^2 : (x, y) \mapsto \begin{bmatrix} \frac{\partial f}{\partial x}(x, y) \\[0.5ex] \frac{\partial f}{\partial y}(x, y) \end{bmatrix}
\]

is obviously a vector field.

Continuity and differentiability of vector fields are defined componentwise.

Definition 16.1 The function

\[
\mathbf{F} : D \subset \mathbb{R}^2 \to \mathbb{R}^2 : (x, y) \mapsto \mathbf{F}(x, y) = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}
\]

is called continuous (or partially differentiable or Frechet differentiable, respectively)
if and only if its two components $f : D \to \mathbb{R}$ and $g : D \to \mathbb{R}$ have the corresponding
property, i.e. they are continuous (or partially differentiable or Frechet differentiable,
respectively).


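If both $f$ and $g$ are Frechet differentiable, one has the linearisations

\[
\begin{aligned}
f(x, y) &= f(a, b) + \Big[ \frac{\partial f}{\partial x}(a, b), \; \frac{\partial f}{\partial y}(a, b) \Big] \begin{bmatrix} x - a \\ y - b \end{bmatrix} + R_1(x, y; a, b), \\
g(x, y) &= g(a, b) + \Big[ \frac{\partial g}{\partial x}(a, b), \; \frac{\partial g}{\partial y}(a, b) \Big] \begin{bmatrix} x - a \\ y - b \end{bmatrix} + R_2(x, y; a, b)
\end{aligned}
\]

for $(x, y)$ close to $(a, b)$ with remainder terms $R_1$ and $R_2$. If one combines these two
formulas to one formula using matrix-vector notation, one obtains

\[
\begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}
= \begin{bmatrix} f(a, b) \\ g(a, b) \end{bmatrix}
+ \begin{bmatrix}
\frac{\partial f}{\partial x}(a, b) & \frac{\partial f}{\partial y}(a, b) \\[0.5ex]
\frac{\partial g}{\partial x}(a, b) & \frac{\partial g}{\partial y}(a, b)
\end{bmatrix}
\begin{bmatrix} x - a \\ y - b \end{bmatrix}
+ \begin{bmatrix} R_1(x, y; a, b) \\ R_2(x, y; a, b) \end{bmatrix},
\]

or in shorthand notation

\[
\mathbf{F}(x, y) = \mathbf{F}(a, b) + \mathbf{F}'(a, b) \begin{bmatrix} x - a \\ y - b \end{bmatrix} + \mathbf{R}(x, y; a, b)
\]

with the remainder term $\mathbf{R}(x, y; a, b)$ and the $(2 \times 2)$-Jacobian

\[
\mathbf{F}'(a, b) =
\begin{bmatrix}
\frac{\partial f}{\partial x}(a, b) & \frac{\partial f}{\partial y}(a, b) \\[0.5ex]
\frac{\partial g}{\partial x}(a, b) & \frac{\partial g}{\partial y}(a, b)
\end{bmatrix}.
\]

The linear mapping defined by this matrix is called (Frechet) derivative of the function
$\mathbf{F}$ at the point $(a, b)$. The remainder term $\mathbf{R}$ has the property

\[
\lim_{(x,y) \to (a,b)} \frac{\sqrt{R_1(x, y; a, b)^2 + R_2(x, y; a, b)^2}}{\sqrt{(x - a)^2 + (y - b)^2}} = 0.
\]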

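Example 16.2 (Polar coordinates) The mapping

\[
\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2 : (r, \varphi) \mapsto \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} r \cos \varphi \\ r \sin \varphi \end{bmatrix}
\]

is (everywhere) differentiable with derivative (Jacobian)

\[
\mathbf{F}'(r, \varphi) = \begin{bmatrix} \cos \varphi & -r \sin \varphi \\ \sin \varphi & r \cos \varphi \end{bmatrix}.
\]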

16.2 Newton's Method in Two Variables

The linearisation

\[
\mathbf{F}(x, y) \approx \mathbf{F}(a, b) + \mathbf{F}'(a, b) \begin{bmatrix} x - a \\ y - b \end{bmatrix}
\]

is the key for solving nonlinear equations in two (or more) unknowns. In this section,
we derive Newton's method for determining the zeros of a function

\[
\mathbf{F}(x, y) = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}
\]

of two variables and two components.


Example 16.3 (Intersection of a circle with a hyperbola) Consider the circle
$x^2 + y^2 = 4$ and the hyperbola $xy = 1$. The points of intersection are the zeros of the
vector equation $\mathbf{F}(x, y) = \mathbf{0}$ with

\[
\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2 : \mathbf{F}(x, y) = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix} = \begin{bmatrix} x^2 + y^2 - 4 \\ xy - 1 \end{bmatrix}.
\]

The level curves $f(x, y) = 0$ and $g(x, y) = 0$ are sketched in Fig. 16.1.
Fig. 16.1 Intersection of a circle with a hyperbola


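Newton's method for determining the zeros is based on the following idea. For a
starting value $(x_0, y_0)$ which is sufficiently close to the solution, one computes an
improved value by replacing the function by its linear approximation at $(x_0, y_0)$

\[
\mathbf{F}(x, y) \approx \mathbf{F}(x_0, y_0) + \mathbf{F}'(x_0, y_0) \begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix}.
\]

The zero of the linearisation

\[
\mathbf{F}(x_0, y_0) + \mathbf{F}'(x_0, y_0) \begin{bmatrix} x_1 - x_0 \\ y_1 - y_0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

is taken as improved approximation $(x_1, y_1)$, so

\[
\mathbf{F}'(x_0, y_0) \begin{bmatrix} x_1 - x_0 \\ y_1 - y_0 \end{bmatrix} = -\mathbf{F}(x_0, y_0),
\]

and

\[
\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} - \big( \mathbf{F}'(x_0, y_0) \big)^{-1} \mathbf{F}(x_0, y_0),
\]

respectively. This can only be carried out if the Jacobian is invertible, i.e. its determinant
is not equal to zero. In the example above the Jacobian is

\[
\mathbf{F}'(x, y) = \begin{bmatrix} 2x & 2y \\ y & x \end{bmatrix}
\]

with determinant $\det \mathbf{F}'(x, y) = 2x^2 - 2y^2$. Thus it is singular on the straight lines
$x = \pm y$. These lines are plotted as dashed lines in Fig. 16.1.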

The idea now is to iterate the procedure, i.e. to repeat Newton's step with the
improved value as new starting value

\[
\begin{bmatrix} x_{k+1} \\ y_{k+1} \end{bmatrix}
= \begin{bmatrix} x_k \\ y_k \end{bmatrix}
- \begin{bmatrix}
\frac{\partial f}{\partial x}(x_k, y_k) & \frac{\partial f}{\partial y}(x_k, y_k) \\[0.5ex]
\frac{\partial g}{\partial x}(x_k, y_k) & \frac{\partial g}{\partial y}(x_k, y_k)
\end{bmatrix}^{-1}
\begin{bmatrix} f(x_k, y_k) \\ g(x_k, y_k) \end{bmatrix}
\]

for $k = 1, 2, 3, \ldots$ until the desired accuracy is reached. The procedure generally
converges rapidly as is shown in the following proposition. For a proof, see [23,
Chap. 7, Theorem 7.1].

Proposition 16.4 Let $\mathbf{F} : D \to \mathbb{R}^2$ be twice continuously differentiable with
$\mathbf{F}(a, b) = \mathbf{0}$ and $\det \mathbf{F}'(a, b) \ne 0$. If the starting value $(x_0, y_0)$ lies sufficiently close
to the solution $(a, b)$ then Newton's method converges quadratically.

One often sums up this fact under the term local quadratic convergence of Newton's
method.

Example 16.5 The intersection points of the circle and the hyperbola can also be
computed analytically. Since

\[
xy = 1 \quad \Leftrightarrow \quad x = \frac{1}{y}
\]

we may insert $x = 1/y$ into the equation $x^2 + y^2 = 4$ to obtain the biquadratic
equation

\[
y^4 - 4y^2 + 1 = 0.
\]

By substituting $y^2 = u$ the equation is easily solvable. The intersection point with
the largest $x$-component has the coordinates

\[
\begin{aligned}
x &= \sqrt{2 + \sqrt{3}} = 1.93185165257813657\ldots \\
y &= \sqrt{2 - \sqrt{3}} = 0.51763809020504152\ldots
\end{aligned}
\]

Application of Newton's method with starting values $x_0 = 2$ and $y_0 = 1$ yields
the above solution in 5 steps with 16 digits accuracy. The quadratic convergence can
be observed from the fact that the number of correct digits doubles with each step.


x                    y                         Error
2.000000000000000    1.000000000000000         4.871521418175E-001
2.000000000000000    5.000000000000000E-001    7.039388810410E-002
1.933333333333333    5.166666666666667E-001    1.771734052060E-003
1.931852741096439    5.176370548219287E-001    1.502295005704E-006
1.931851652578934    5.176380902042443E-001    1.127875985998E-012
1.931851652578136    5.176380902050416E-001    2.220446049250E-016
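
The iteration of Example 16.5 can be sketched in a few lines of Python. This is a minimal illustration, not a transcription of the book's programs: the names newton_2d, circle, hyperbola and jacobian are assumptions made for this sketch, and the 2x2 linear system is solved by Cramer's rule instead of forming the inverse matrix explicitly.

```python
import math


def circle(x, y):
    """f(x, y) = x^2 + y^2 - 4 (circle of radius 2)."""
    return x * x + y * y - 4.0


def hyperbola(x, y):
    """g(x, y) = x*y - 1."""
    return x * y - 1.0


def jacobian(x, y):
    """Entries (f_x, f_y, g_x, g_y) of the Jacobian of (f, g)."""
    return 2.0 * x, 2.0 * y, y, x


def newton_2d(f, g, jac, x, y, tol=1e-14, max_steps=20):
    """Newton's method for the system f(x, y) = 0, g(x, y) = 0."""
    for step in range(max_steps):
        fx, fy, gx, gy = jac(x, y)
        det = fx * gy - fy * gx
        if det == 0.0:
            # The Newton step is undefined when the Jacobian is singular.
            raise ZeroDivisionError("singular Jacobian at (%g, %g)" % (x, y))
        fv, gv = f(x, y), g(x, y)
        # Solve J * (dx, dy) = (f, g) by Cramer's rule.
        dx = (gy * fv - fy * gv) / det
        dy = (fx * gv - gx * fv) / det
        x, y = x - dx, y - dy
        if abs(dx) + abs(dy) < tol:
            return x, y, step + 1
    return x, y, max_steps


x, y, steps = newton_2d(circle, hyperbola, jacobian, 2.0, 1.0)
print(x, y, steps)
print(math.sqrt(2.0 + math.sqrt(3.0)), math.sqrt(2.0 - math.sqrt(3.0)))
```

Printing the iterates inside the loop would reproduce a table like the one above; other starting values can be passed as the fourth and fifth arguments.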
Experiment 16.6 Using the MATLAB programs mat16_1.m and mat16_2.m
compute the intersection points from Example 16.3. Experiment with different starting
values, and this way try to determine all four solutions to the problem. What
happens if the starting value is chosen to be $(x_0, y_0) = (1, 1)$?


16.3 Parametric Surfaces


In Sect. 15.1 we investigated surfaces as graphs of functions $f : D \subset \mathbb{R}^2 \to \mathbb{R}$. However,
similar to the case of curves, this concept is too narrow to represent more
complicated surfaces. The remedy is to use parameterisations, as was done for
curves.

The starting point for the construction of a parametric surface is a (componentwise)
continuous mapping

\[ (u, v) \mapsto \mathbf{x}(u, v) = \begin{bmatrix} x(u, v) \\ y(u, v) \\ z(u, v) \end{bmatrix} \]

of a parameter domain $D \subset \mathbb{R}^2$ to $\mathbb{R}^3$. By fixing one parameter $u = u_0$ or $v = v_0$ at
a time one obtains coordinate curves in space

\[ u \mapsto \mathbf{x}(u, v_0) \ \ldots\ u\text{-curve}, \qquad v \mapsto \mathbf{x}(u_0, v) \ \ldots\ v\text{-curve}. \]


Definition 16.7 A regular parametric surface is defined by a mapping
$D \subset \mathbb{R}^2 \to \mathbb{R}^3 : (u, v) \mapsto \mathbf{x}(u, v)$ which satisfies the following conditions:

(a) the mapping $(u, v) \mapsto \mathbf{x}(u, v)$ is injective;

(b) the u-curves and the v-curves are continuously differentiable;

(c) the tangent vectors to the u-curves and v-curves are linearly independent at every
point (thus always span a plane).


These conditions guarantee that the parametric surface is indeed a two-dimensional
smooth subset of $\mathbb{R}^3$.

For  a  regular  surface,  the  tangent  vectors 


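\[
\frac{\partial \mathbf{x}}{\partial u}(u, v) = \begin{bmatrix} \frac{\partial x}{\partial u}(u, v) \\ \frac{\partial y}{\partial u}(u, v) \\ \frac{\partial z}{\partial u}(u, v) \end{bmatrix},
\qquad
\frac{\partial \mathbf{x}}{\partial v}(u, v) = \begin{bmatrix} \frac{\partial x}{\partial v}(u, v) \\ \frac{\partial y}{\partial v}(u, v) \\ \frac{\partial z}{\partial v}(u, v) \end{bmatrix}
\]

span the tangent plane at $\mathbf{x}(u, v)$. The tangent plane has the parametric representation

\[ \mathbf{p}(\lambda, \mu) = \mathbf{x}(u, v) + \lambda\, \frac{\partial \mathbf{x}}{\partial u}(u, v) + \mu\, \frac{\partial \mathbf{x}}{\partial v}(u, v), \qquad \lambda, \mu \in \mathbb{R}. \]

The regularity condition (c) is equivalent to the assertion that

\[ \frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v} \neq \mathbf{0}. \]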


The  cross  product  constitutes  a  normal  vector  to  the  (tangent  plane  of  the)  surface. 


Example 16.8 (Surfaces of rotation) By rotation of the graph of a continuously
differentiable, positive function $z \mapsto h(z)$, $a < z < b$, around the z-axis, one obtains
a surface of rotation with parametrisation

\[ D = (a, b) \times (0, 2\pi), \qquad \mathbf{x}(u, v) = \begin{bmatrix} h(u) \cos v \\ h(u) \sin v \\ u \end{bmatrix}. \]

The v-curves are horizontal circles, the u-curves are the generator lines. Note that
the generator line corresponding to the angle v = 0 has been removed to ensure
condition (a). To verify condition (c) we compute the cross product of the tangent
vectors to the u- and the v-curves

\[
\frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v}
= \begin{bmatrix} h'(u) \cos v \\ h'(u) \sin v \\ 1 \end{bmatrix} \times \begin{bmatrix} -h(u) \sin v \\ h(u) \cos v \\ 0 \end{bmatrix}
= \begin{bmatrix} -h(u) \cos v \\ -h(u) \sin v \\ h(u)\, h'(u) \end{bmatrix}.
\]

Due to $h(u) > 0$ this vector is not zero; the two tangent vectors are hence not collinear.

Figure 16.2 shows the surface of rotation which is generated by $h(u) = 0.4 + \cos(4\pi u)/3$,
$u \in (0, 1)$. In MATLAB one advantageously uses the command
cylinder in combination with the command mesh for the representation of such
surfaces.
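
The cross product in Example 16.8 can be checked numerically. The following Python sketch evaluates the two tangent vectors for the generating function of Fig. 16.2 and compares their cross product with the closed form derived above; the function names (h, dh, tangent_u, tangent_v, cross, normal) are chosen for this illustration only.

```python
import math


def h(u):
    """Generating function of Fig. 16.2."""
    return 0.4 + math.cos(4.0 * math.pi * u) / 3.0


def dh(u):
    """Derivative h'(u)."""
    return -4.0 * math.pi * math.sin(4.0 * math.pi * u) / 3.0


def tangent_u(u, v):
    """Tangent vector to the u-curve (generator line)."""
    return (dh(u) * math.cos(v), dh(u) * math.sin(v), 1.0)


def tangent_v(u, v):
    """Tangent vector to the v-curve (horizontal circle)."""
    return (-h(u) * math.sin(v), h(u) * math.cos(v), 0.0)


def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normal(u, v):
    """Normal vector as cross product of the two tangent vectors."""
    return cross(tangent_u(u, v), tangent_v(u, v))


u0, v0 = 0.3, 1.2
n_vec = normal(u0, v0)
closed_form = (-h(u0) * math.cos(v0), -h(u0) * math.sin(v0), h(u0) * dh(u0))
print(n_vec)
print(closed_form)
```

Since h(u) stays above 0.4 - 1/3 > 0 on (0, 1), the first two components of the normal vector cannot vanish simultaneously, in agreement with condition (c).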


Example 16.9 (The sphere) The sphere of radius R is obtained by the parametrisation

\[ D = (0, \pi) \times (0, 2\pi), \qquad \mathbf{x}(u, v) = \begin{bmatrix} R \sin u \cos v \\ R \sin u \sin v \\ R \cos u \end{bmatrix}. \]

Fig. 16.2 Surface of rotation, generated by rotation of a graph h(z) about the z-axis. The underlying
graph h(z) is represented on the right

Fig. 16.3 Unit sphere as parametric surface. The interpretation of the parameters u, v as angles is
given in the picture on the right

The v-curves are the circles of latitude, the u-curves the meridians. The meaning of
the parameters u, v as angles can be seen from Fig. 16.3.


16.4  Exercises 

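1. Compute the Jacobian of the mapping

\[ \begin{bmatrix} u \\ v \end{bmatrix} = F(x, y) = \begin{bmatrix} x^2 + y^2 \\ x^2 - y^2 \end{bmatrix}. \]

For which values of x and y is the Jacobian invertible?

2. Program Newton's method in several variables and test the program on the
problem

\[ x + \sin y = 4, \qquad xy = 1 \]

with starting values x = 2 and y = 1. If you are working in MATLAB, you can
solve this question by modifying mat16_2.m.

3. Compute the tangent vectors $\frac{\partial \mathbf{x}}{\partial u}$, $\frac{\partial \mathbf{x}}{\partial v}$ and the normal vector $\frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v}$ to the sphere
of radius R (Example 16.9). What can you observe about the direction of the
normal vector?

4. Sketch the surface of revolution

\[ \mathbf{x}(u, v) = \begin{bmatrix} \cos u \cos v \\ \cos u \sin v \\ u \end{bmatrix}, \qquad -1 < u < 1, \quad 0 < v < 2\pi. \]

Compute the tangent vectors $\frac{\partial \mathbf{x}}{\partial u}$, $\frac{\partial \mathbf{x}}{\partial v}$ and the normal vector $\frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v}$. Determine
the equation of the tangent plane at the point $(1/\sqrt{2}, 1/\sqrt{2}, 0)$.

5. Sketch the paraboloid

\[ \mathbf{x}(u, v) = \begin{bmatrix} u \cos v \\ u \sin v \\ 1 - u^2 \end{bmatrix}, \qquad 0 < u < 1, \quad 0 < v < 2\pi, \]

and plot some of the u- and v-curves. Compute the tangent vectors $\frac{\partial \mathbf{x}}{\partial u}$, $\frac{\partial \mathbf{x}}{\partial v}$ and
the normal vector $\frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v}$.

6. Plot some of the u- and v-curves for the helicoid

\[ \mathbf{x}(u, v) = \begin{bmatrix} u \cos v \\ u \sin v \\ v \end{bmatrix}, \qquad 0 < u < 1, \quad 0 < v < 2\pi. \]

What kind of curves are they? Try to sketch the surface.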
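7. A planar vector field (see also Sect. 20.1)

\[ (x, y) \mapsto F(x, y) = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix} \]

can be visualised by plotting a grid of points $(x_i, y_j)$ in the plane and attaching
the vector $F(x_i, y_j)$ to each grid point. Sketch the vector fields

\[ F(x, y) = \frac{1}{\sqrt{x^2 + y^2}} \begin{bmatrix} x \\ y \end{bmatrix} \qquad\text{and}\qquad G(x, y) = \frac{1}{\sqrt{x^2 + y^2}} \begin{bmatrix} -y \\ x \end{bmatrix} \]

in this way.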


17 Integration of Functions of Two Variables


In Sect. 11.3 we have shown how to calculate the volume of solids of revolution. If
there is no rotational symmetry, however, one needs an extension of integral calculus
to functions of two variables. This arises, for example, if one wants to find the volume
of a solid that lies between a domain D in the (x, y)-plane and the graph of a
non-negative function z = f(x, y). In this section we will extend the notion of Riemann
integrals from Chap. 11 to double integrals of functions of two variables. Important
tools for the computation of double integrals are their representation as iterated
integrals and the transformation formula (change of coordinates). The integration of
functions of several variables occurs in numerous applications, a few of which we
will discuss.


17.1 Double Integrals

We start with the integration of a real-valued function z = f(x, y) which is defined
on a rectangle $R = [a, b] \times [c, d]$. More general domains of integration $D \subset \mathbb{R}^2$ will
be discussed below. Since we know from Sect. 11.1 that Riemann integrable functions
are necessarily bounded, we assume in the whole section that f is bounded. If f
is non-negative, the integral should be interpretable as the volume of the solid with
base R and top surface given by the graph of f (see Fig. 17.2). This motivates the
following approach in which the solid is approximated by a sum of cuboids.

We place a rectangular grid G over the domain R by partitioning the intervals
[a, b] and [c, d] as in Sect. 11.1:

\[ Z_x :\ a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b, \]

\[ Z_y :\ c = y_0 < y_1 < y_2 < \cdots < y_{m-1} < y_m = d. \]

The rectangular grid is made up of the small rectangles

\[ [x_{i-1}, x_i] \times [y_{j-1}, y_j], \qquad i = 1, \ldots, n, \quad j = 1, \ldots, m. \]

The mesh size $\Phi(G)$ is the length of the largest subinterval involved:

\[ \Phi(G) = \max\big( |x_i - x_{i-1}|,\ |y_j - y_{j-1}| \ ;\ i = 1, \ldots, n,\ j = 1, \ldots, m \big). \]

Finally we choose an arbitrary intermediate point $p_{ij} = (\xi_{ij}, \eta_{ij})$ in each of the
rectangles of the grid, see Fig. 17.1.

The double sum

\[ S = \sum_{i=1}^{n} \sum_{j=1}^{m} f(\xi_{ij}, \eta_{ij})\,(x_i - x_{i-1})(y_j - y_{j-1}) \]

is again called a Riemann sum. Since the volume of a cuboid with base $[x_{i-1}, x_i] \times [y_{j-1}, y_j]$
and height $f(\xi_{ij}, \eta_{ij})$ is

\[ f(\xi_{ij}, \eta_{ij})\,(x_i - x_{i-1})(y_j - y_{j-1}), \]

the above Riemann sum is an approximation to the volume under the graph of f
(Fig. 17.2).

As in Sect. 11.1, the integral is now defined as a limit of Riemann sums. We
consider a sequence $G_1, G_2, G_3, \ldots$ of grids whose mesh size $\Phi(G_N)$ tends to zero
as $N \to \infty$ and the corresponding Riemann sums $S_N$.

Fig. 17.1 Partitioning the rectangle R

Fig. 17.2 Volume and approximation by cuboids
Definition 17.1 A bounded function z = f(x, y) is called Riemann integrable on
$R = [a, b] \times [c, d]$ if for arbitrary sequences of grids $(G_N)_{N \ge 1}$ with $\Phi(G_N) \to 0$
the corresponding Riemann sums $(S_N)_{N \ge 1}$ tend to the same limit I(f), independently
of the choice of intermediate points. This limit

\[ I(f) = \iint_R f(x, y)\, d(x, y) \]

is called the double integral of f on R.

Experiment 17.2 Study the M-file mat17_1.m and experiment with different
randomly chosen Riemann sums for the function $z = x^2 + y^2$ on the rectangle
$[0, 1] \times [0, 1]$. What happens if you choose finer and finer grids?
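
A Riemann sum with randomly chosen intermediate points can be sketched in Python as follows. This is an illustration under simplifying assumptions, not a transcription of the book's M-file: the grid is taken to be equidistant, and the names riemann_sum and paraboloid are chosen for this sketch.

```python
import random


def paraboloid(x, y):
    """The function z = x^2 + y^2 of Experiment 17.2."""
    return x * x + y * y


def riemann_sum(f, a, b, c, d, n, m, rng):
    """Riemann sum of f on [a, b] x [c, d] for an equidistant n x m grid.

    In every grid rectangle a random intermediate point is chosen.
    """
    hx = (b - a) / n
    hy = (d - c) / m
    total = 0.0
    for i in range(n):
        for j in range(m):
            xi = a + (i + rng.random()) * hx   # random point in [x_{i-1}, x_i]
            eta = c + (j + rng.random()) * hy  # random point in [y_{j-1}, y_j]
            total += f(xi, eta) * hx * hy
    return total


rng = random.Random(0)  # fixed seed so that runs are repeatable
for n in (5, 20, 80):
    print(n, riemann_sum(paraboloid, 0.0, 1.0, 0.0, 1.0, n, n, rng))
```

As the grid is refined, the printed values should settle near 2/3, the value of the double integral of x^2 + y^2 over the unit square.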

As in the case of one variable, one may use the definition of the double integral
to obtain a numerical approximation to the integral. However, the definition is of little use
for the analytic evaluation of integrals. In Sect. 11.1 the fundamental theorem of
calculus proved helpful; here the representation as an iterated integral plays this role. In this
way the computation of double integrals is reduced to the integration of functions in
one variable.


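Proposition 17.3 (The double integral as iterated integral) If a bounded function
f and its partial functions $x \mapsto f(x, y)$, $y \mapsto f(x, y)$ are Riemann integrable on
$R = [a, b] \times [c, d]$, then the mappings $x \mapsto \int_c^d f(x, y)\, dy$ and $y \mapsto \int_a^b f(x, y)\, dx$
are Riemann integrable as well and

\[ \iint_R f(x, y)\, d(x, y) = \int_a^b \Big( \int_c^d f(x, y)\, dy \Big) dx = \int_c^d \Big( \int_a^b f(x, y)\, dx \Big) dy. \]

Outline of the proof. If one chooses intermediate points in the Riemann sums of the
special form $p_{ij} = (\xi_i, \eta_j)$ with $\xi_i \in [x_{i-1}, x_i]$, $\eta_j \in [y_{j-1}, y_j]$, then

\[ \iint_R f(x, y)\, d(x, y) \approx \sum_{i=1}^{n} \Big( \sum_{j=1}^{m} f(\xi_i, \eta_j)(y_j - y_{j-1}) \Big)(x_i - x_{i-1}) \approx \int_a^b \Big( \int_c^d f(x, y)\, dy \Big) dx \]

and likewise for the second statement by changing the order. For a rigorous proof of
this argument, we refer to the literature, for instance [4, Theorem 8.13 and Corollary]. □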

Figure 17.3 serves to illustrate Proposition 17.3. The volume is approximated by
summation of thin slices parallel to the axis instead of small cuboids. Proposition 17.3
states that the volume of the solid is obtained by integration over the area of the cross
sections (perpendicular to the x- or y-axis). In this form Proposition 17.3 is called
Cavalieri's principle.¹ In general integration theory one also speaks of Fubini's
theorem.² Since in the case of integrability the order of integration does not matter,
one often omits the brackets and writes

\[ \iint_R f(x, y)\, d(x, y) = \int_c^d \int_a^b f(x, y)\, dx\, dy = \int_a^b \int_c^d f(x, y)\, dy\, dx. \]

Fig. 17.3 The double integral as iterated integral

Example 17.4 Let $R = [0, 1] \times [0, 1]$. The volume of the body

\[ B = \{(x, y, z) \in \mathbb{R}^3 : (x, y) \in R,\ 0 \le z \le x^2 + y^2\} \]

is obtained using Proposition 17.3 as follows, see also Fig. 17.4:

\[ \iint_R (x^2 + y^2)\, d(x, y) = \int_0^1 \Big( \int_0^1 (x^2 + y^2)\, dy \Big) dx = \int_0^1 \Big( x^2 + \frac{1}{3} \Big) dx = \frac{2}{3}. \]

Fig. 17.4 The body B

¹ B. Cavalieri, 1598-1647.

² G. Fubini, 1879-1943.

Fig. 17.5 Area as volume of the cylinder of height one

We now turn to the integration over more general (bounded) domains $D \subset \mathbb{R}^2$.
The indicator function of the domain D is

\[ \mathbf{1}_D(x, y) = \begin{cases} 1, & (x, y) \in D, \\ 0, & (x, y) \notin D. \end{cases} \]

We can enclose the bounded domain D in a rectangle R ($D \subset R$). If the Riemann
integral of the indicator function of D exists, then it represents the volume of the
cylinder of height one and base D and thus the area of D (Fig. 17.5). The result
obviously does not depend on the size of the surrounding rectangle since the indicator
function assumes the value zero outside the domain D.


Definition 17.5 Let D be a bounded domain and R an enclosing rectangle.

(a) If the indicator function of D is Riemann integrable then the domain D is called
measurable and one sets

\[ \iint_D d(x, y) = \iint_R \mathbf{1}_D(x, y)\, d(x, y). \]

(b) A subset $N \subset \mathbb{R}^2$ is called a set of measure zero if $\iint_N d(x, y) = 0$.

(c) For a bounded function z = f(x, y), its integral over a measurable domain D is
defined as

\[ \iint_D f(x, y)\, d(x, y) = \iint_R f(x, y)\, \mathbf{1}_D(x, y)\, d(x, y), \]

if $f(x, y)\, \mathbf{1}_D(x, y)$ is Riemann integrable.


Sets of measure zero are, for example, single points, straight line segments or
segments of differentiable curves in the plane. Item (c) of the definition states that
the integral of a function f over a domain D is determined by continuing f to a
larger rectangle R and assigning the value zero outside D.


Remark 17.6 (a) If D is a measurable domain, N a set of measure zero and f is
integrable over the respective domains then

\[ \iint_D f(x, y)\, d(x, y) = \iint_{D \setminus N} f(x, y)\, d(x, y). \]

(b) Let $D = D_1 \cup D_2$. If $D_1 \cap D_2$ is a set of measure zero then

\[ \iint_D f(x, y)\, d(x, y) = \iint_{D_1} f(x, y)\, d(x, y) + \iint_{D_2} f(x, y)\, d(x, y). \]

The integral over the entire domain D is thus obtained as the sum of the integrals over
subdomains. The proof of this statement can easily be obtained by working with
Riemann sums.


An  important  class  of  domains  D  on  which  integration  is  simple  are  the  so-called 
normal  domains. 

Definition 17.7 (a) A subset $D \subset \mathbb{R}^2$ is called normal domain of type I if

\[ D = \{(x, y) \in \mathbb{R}^2 \ ;\ a \le x \le b,\ v(x) \le y \le w(x)\} \]

with certain continuously differentiable lower and upper bounding functions $x \mapsto v(x)$,
$x \mapsto w(x)$.

(b) A subset $D \subset \mathbb{R}^2$ is called normal domain of type II if

\[ D = \{(x, y) \in \mathbb{R}^2 \ ;\ c \le y \le d,\ l(y) \le x \le r(y)\} \]

with certain continuously differentiable left and right bounding functions $y \mapsto l(y)$,
$y \mapsto r(y)$.


Figure  17.6  shows  examples  of  normal  domains. 


Proposition 17.8 (Integration over normal domains) Let D be a normal domain
and $f : D \to \mathbb{R}$ continuous. For normal domains of type I, one has

\[ \iint_D f(x, y)\, d(x, y) = \int_a^b \Big( \int_{v(x)}^{w(x)} f(x, y)\, dy \Big) dx \]

and for normal domains of type II

\[ \iint_D f(x, y)\, d(x, y) = \int_c^d \Big( \int_{l(y)}^{r(y)} f(x, y)\, dx \Big) dy. \]

Fig. 17.6 Normal domains of type I and II

Proof The statements follow from Proposition 17.3. We recall that f is extended by
zero outside of D. For details we refer to the remark at the end of [4, Chap. 8.3]. □


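Example 17.9 For the calculation of the volume of the body lying between the triangle
$D = \{(x, y) \ ;\ 0 \le x \le 1,\ 0 \le y \le 1 - x\}$ and the graph of $z = x^2 + y^2$, we
interpret D as normal domain of type I with the boundaries v(x) = 0, w(x) = 1 - x.
Consequently

\[ \iint_D (x^2 + y^2)\, d(x, y) = \int_0^1 \Big( \int_0^{1-x} (x^2 + y^2)\, dy \Big) dx = \int_0^1 \Big( x^2 (1 - x) + \frac{(1 - x)^3}{3} \Big) dx = \frac{1}{6}, \]

as can be seen by multiplying out and integrating term by term.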
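
The iterated integral over a normal domain of type I can also be approximated numerically. The following Python sketch applies the midpoint rule first in y (between the bounding functions) and then in x; the function name integrate_type1 and the choice of the midpoint rule are assumptions made for this illustration.

```python
def integrate_type1(f, a, b, v, w, n=200, m=200):
    """Approximate the integral of f over {a <= x <= b, v(x) <= y <= w(x)}.

    The outer integral in x and the inner integral in y are both
    approximated by the midpoint rule.
    """
    hx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        lower, upper = v(x), w(x)
        hy = (upper - lower) / m
        inner = 0.0
        for j in range(m):
            y = lower + (j + 0.5) * hy
            inner += f(x, y) * hy
        total += inner * hx
    return total


# Triangle of Example 17.9: v(x) = 0, w(x) = 1 - x.
volume = integrate_type1(lambda x, y: x * x + y * y,
                         0.0, 1.0,
                         lambda x: 0.0, lambda x: 1.0 - x)
print(volume)
```

The printed value should be close to 1/6. Passing the constant function 1 as f yields an approximation to the area of the domain.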


17.2 Applications of the Double Integral

For modelling purposes it is useful to introduce a simplified notation for Riemann
sums. In the case of equidistant partitions $Z_x$, $Z_y$ where all subintervals have the
same lengths, one writes

\[ \Delta x = x_i - x_{i-1}, \qquad \Delta y = y_j - y_{j-1} \]

and calls

\[ \Delta A = \Delta x\, \Delta y \]

the area element of the grid G. If one then takes the right upper corner $p_{ij} = (x_i, y_j)$
of the subrectangle $[x_{i-1}, x_i] \times [y_{j-1}, y_j]$ as an intermediate point, the corresponding
Riemann sum reads

\[ S = \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j)\, \Delta A = \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j)\, \Delta x\, \Delta y. \]

Application 17.10 (Mass as integral of the density) A thin plane object D has
density $\rho(x, y)$ [mass/unit area] at the point (x, y). If the density $\rho$ is constant everywhere
then its total mass is simply the product of density and area. In the case of
variable density (e.g. due to a change of the material properties from point to point),
we partition D into smaller rectangles with sides $\Delta x$, $\Delta y$. The mass contained in such
a small rectangle around (x, y) is approximately equal to $\rho(x, y)\, \Delta x\, \Delta y$. The total
mass is thus approximately equal to

\[ \sum_{i=1}^{n} \sum_{j=1}^{m} \rho(x_i, y_j)\, \Delta x\, \Delta y. \]

However, this is just a Riemann sum for

\[ M = \iint_D \rho(x, y)\, dx\, dy. \]

This consideration shows that the integral of the density function is a feasible model
for representing the total mass of a two-dimensional object.


Application 17.11 (Centre of gravity) We consider a two-dimensional flat object
D as in Application 17.10. The two statical moments of a small rectangle close to
(x, y) with respect to a point $(x^*, y^*)$ are

\[ (x - x^*)\, \rho(x, y)\, \Delta x\, \Delta y, \qquad (y - y^*)\, \rho(x, y)\, \Delta x\, \Delta y, \]

see Fig. 17.7.

The relevance of the statical moments can be seen if one considers the object under
the influence of gravity. Multiplied by the gravitational acceleration g one obtains
the moments of force with respect to the axes through $(x^*, y^*)$ in direction of the
coordinates (force times lever arm). The centre of gravity of the two-dimensional
object D is the point $(x_S, y_S)$ with respect to which the total statical moments vanish:

\[ \sum_{i=1}^{n} \sum_{j=1}^{m} (x_i - x_S)\, \rho(x_i, y_j)\, \Delta x\, \Delta y \approx 0, \qquad \sum_{i=1}^{n} \sum_{j=1}^{m} (y_j - y_S)\, \rho(x_i, y_j)\, \Delta x\, \Delta y \approx 0. \]

In the limit, as the mesh size of the grid tends to zero, one obtains

\[ \iint_D (x - x_S)\, \rho(x, y)\, dx\, dy = 0, \qquad \iint_D (y - y_S)\, \rho(x, y)\, dx\, dy = 0 \]

as defining equations for the centre of gravity; i.e.,

\[ x_S = \frac{1}{M} \iint_D x\, \rho(x, y)\, dx\, dy, \qquad y_S = \frac{1}{M} \iint_D y\, \rho(x, y)\, dx\, dy, \]

where M denotes the total mass as in Application 17.10.

Fig. 17.7 The statical moments

Fig. 17.8 Centre of gravity of the quarter circle

For the special case of a constant density $\rho(x, y) = 1$ one obtains the geometric
centre of gravity of the domain D.


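Example 17.12 (Geometric centre of gravity of a quarter circle) Let D be the quarter
circle of radius r about (0, 0) in the first quadrant; i.e.,
$D = \{(x, y) \ ;\ 0 \le x \le r,\ 0 \le y \le \sqrt{r^2 - x^2}\}$ (Fig. 17.8). With density $\rho(x, y) = 1$ one obtains the area M
as $r^2 \pi / 4$. The first statical moment is

\[ \iint_D x\, dx\, dy = \int_0^r \Big( \int_0^{\sqrt{r^2 - x^2}} x\, dy \Big) dx = \int_0^r x \sqrt{r^2 - x^2}\, dx = -\frac{1}{3} (r^2 - x^2)^{3/2} \Big|_{x=0}^{x=r} = \frac{1}{3} r^3. \]

The x-coordinate of the centre of gravity is thus given by $x_S = \frac{4}{r^2 \pi} \cdot \frac{1}{3} r^3 = \frac{4r}{3\pi}$. For
reasons of symmetry, one has $y_S = x_S$.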
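
The centre of gravity of the quarter circle can be cross-checked numerically. The sketch below approximates the mass and the two statical moments with a midpoint rule over the normal domain of type I; the helper names (double_integral, centre_of_gravity) and the grid sizes are assumptions for this illustration.

```python
import math


def double_integral(f, a, b, v, w, n=400, m=100):
    """Midpoint-rule approximation over {a <= x <= b, v(x) <= y <= w(x)}."""
    hx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        lower, upper = v(x), w(x)
        hy = (upper - lower) / m
        for j in range(m):
            y = lower + (j + 0.5) * hy
            total += f(x, y) * hx * hy
    return total


def centre_of_gravity(rho, a, b, v, w):
    """Return (x_S, y_S) for the density rho on the given normal domain."""
    mass = double_integral(rho, a, b, v, w)
    moment_x = double_integral(lambda x, y: x * rho(x, y), a, b, v, w)
    moment_y = double_integral(lambda x, y: y * rho(x, y), a, b, v, w)
    return moment_x / mass, moment_y / mass


r = 1.0
xs, ys = centre_of_gravity(lambda x, y: 1.0, 0.0, r,
                           lambda x: 0.0,
                           lambda x: math.sqrt(r * r - x * x))
print(xs, ys, 4.0 * r / (3.0 * math.pi))
```

Both coordinates should be close to 4r/(3π), roughly 0.4244 for r = 1. A non-constant density can be passed as the first argument of centre_of_gravity.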


17.3 The Transformation Formula

Similar to the substitution rule for one-dimensional integrals (Sect. 10.2), the transformation
formula for double integrals makes it possible to change coordinates on
the domain D of integration. For the purpose of this section it is convenient to assume
that D is an open subset of $\mathbb{R}^2$ (see Definition 9.1).

Definition 17.13 A bijective, differentiable mapping $F : D \to B = F(D)$ between
two open subsets $D, B \subset \mathbb{R}^2$ is called a diffeomorphism if the inverse mapping $F^{-1}$
is also differentiable.
Fig. 17.9 Transformation of a planar domain

We use the following notation for the variables:

\[ \begin{bmatrix} u \\ v \end{bmatrix} \mapsto \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x(u, v) \\ y(u, v) \end{bmatrix}. \]

Figure 17.9 shows the image B of the domain $D = (0, 1) \times (0, 1)$ under the transformation

\[ \begin{bmatrix} u \\ v \end{bmatrix} \mapsto \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u + v/4 \\ u/4 + v + u^2 v^2 \end{bmatrix}. \]

The aim is to transform the integral of a real-valued function f over the domain B
to one over D.

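For this purpose we lay a grid G over the domain D in the (u, v)-plane and select
a rectangle, for instance with the left lower corner (u, v) and sides spanned by the
vectors

\[ \begin{bmatrix} \Delta u \\ 0 \end{bmatrix}, \qquad \begin{bmatrix} 0 \\ \Delta v \end{bmatrix}. \]

The image of this rectangle under the transformation F will in general have a curvilinear
boundary. In a first approximation we replace it by a parallelogram. In linear
approximation (see Sect. 15.4) we have the following:

\[ F(u + \Delta u, v) \approx F(u, v) + F'(u, v) \begin{bmatrix} \Delta u \\ 0 \end{bmatrix}, \qquad F(u, v + \Delta v) \approx F(u, v) + F'(u, v) \begin{bmatrix} 0 \\ \Delta v \end{bmatrix}. \]

The approximating parallelogram is thus spanned by the vectors

\[ \begin{bmatrix} \frac{\partial x}{\partial u}(u, v) \\ \frac{\partial y}{\partial u}(u, v) \end{bmatrix} \Delta u, \qquad \begin{bmatrix} \frac{\partial x}{\partial v}(u, v) \\ \frac{\partial y}{\partial v}(u, v) \end{bmatrix} \Delta v \]

and has the area (see Appendix A.5)

\[ \left| \det \begin{bmatrix} \frac{\partial x}{\partial u}(u, v) & \frac{\partial x}{\partial v}(u, v) \\ \frac{\partial y}{\partial u}(u, v) & \frac{\partial y}{\partial v}(u, v) \end{bmatrix} \right| \Delta u\, \Delta v = \left| \det F'(u, v) \right| \Delta u\, \Delta v. \]

Fig. 17.10 Transformation of an area element

In short, the area element $\Delta A = \Delta u\, \Delta v$ is changed by the transformation F to the
area element $\Delta F(A) = |\det F'(u, v)|\, \Delta u\, \Delta v$ (see Fig. 17.10).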


Proposition 17.14 (Transformation formula for double integrals) Let D, B be
open, bounded subsets of $\mathbb{R}^2$, $F : D \to B$ a diffeomorphism and $f : B \to \mathbb{R}$ a
bounded mapping. Then

\[ \iint_B f(x, y)\, dx\, dy = \iint_D f(F(u, v))\, \left| \det F'(u, v) \right| du\, dv, \]

as long as the functions f and $f(F)\, |\det F'|$ are Riemann integrable.


Outline of the proof We use Riemann sums on the transformed grid and obtain

\[
\iint_B f(x, y)\, dx\, dy \approx \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j)\, \Delta F(A)
\approx \sum_{i=1}^{n} \sum_{j=1}^{m} f\big(x(u_i, v_j), y(u_i, v_j)\big)\, \left| \det F'(u_i, v_j) \right| \Delta u\, \Delta v
\approx \iint_D f\big(x(u, v), y(u, v)\big)\, \left| \det F'(u, v) \right| du\, dv.
\]

A rigorous proof is tedious and requires a careful study of the boundary of the domain
D and the behaviour of the transformation F near the boundary (see for instance [3,
Chap. 19, Theorem 4.7]). □


Example 17.15 The area of the domain B from Fig. 17.9 can be calculated using the
transformation formula with f(x, y) = 1 as follows. We have

\[ F'(u, v) = \begin{bmatrix} 1 & 1/4 \\ 1/4 + 2uv^2 & 1 + 2u^2 v \end{bmatrix}, \qquad \det F'(u, v) = \frac{15}{16} + 2u^2 v - \frac{1}{2} u v^2 \]

and thus

\[
\iint_B dx\, dy = \iint_D \left| \det F'(u, v) \right| du\, dv
= \int_0^1 \Big( \int_0^1 \Big( \frac{15}{16} + 2u^2 v - \frac{1}{2} u v^2 \Big) dv \Big) du
= \int_0^1 \Big( \frac{15}{16} + u^2 - \frac{1}{6} u \Big) du
= \frac{15}{16} + \frac{1}{3} - \frac{1}{12} = \frac{19}{16}.
\]
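
The value 19/16 can be confirmed numerically by summing the area elements |det F'(u, v)| Δu Δv over a grid on D. The Python sketch below uses midpoints of the grid cells as intermediate points; the names transformation, det_jacobian and transformed_area are chosen for this illustration, and the Jacobian determinant is approximated by central differences with a hypothetical step size eps.

```python
def transformation(u, v):
    """The mapping F of Fig. 17.9."""
    return u + v / 4.0, u / 4.0 + v + u * u * v * v


def det_jacobian(F, u, v, eps=1e-6):
    """Determinant of F'(u, v), approximated by central differences."""
    x_up, y_up = F(u + eps, v)
    x_um, y_um = F(u - eps, v)
    x_vp, y_vp = F(u, v + eps)
    x_vm, y_vm = F(u, v - eps)
    xu = (x_up - x_um) / (2.0 * eps)
    yu = (y_up - y_um) / (2.0 * eps)
    xv = (x_vp - x_vm) / (2.0 * eps)
    yv = (y_vp - y_vm) / (2.0 * eps)
    return xu * yv - xv * yu


def transformed_area(F, n=200):
    """Sum of |det F'| * du * dv over an n x n midpoint grid on (0,1) x (0,1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u = (i + 0.5) * h
            v = (j + 0.5) * h
            total += abs(det_jacobian(F, u, v)) * h * h
    return total


area_B = transformed_area(transformation)
print(area_B, 19.0 / 16.0)
```

The two printed numbers should agree to several digits. Replacing the finite differences by the exact determinant 15/16 + 2u²v − uv²/2 gives the same result up to the discretisation error.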


Example 17.16 (Volume of a hemisphere in polar coordinates) We represent a
hemisphere of radius R by the three-dimensional domain

\[ \{(x, y, z) \ ;\ 0 \le x^2 + y^2 \le R^2,\ 0 \le z \le \sqrt{R^2 - x^2 - y^2}\}. \]

Its volume is obtained by integration of the function $f(x, y) = \sqrt{R^2 - x^2 - y^2}$ over
the base $B = \{(x, y) \ ;\ 0 \le x^2 + y^2 \le R^2\}$. In polar coordinates

\[ F : \mathbb{R}^2 \to \mathbb{R}^2 : \begin{bmatrix} r \\ \varphi \end{bmatrix} \mapsto \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} r \cos \varphi \\ r \sin \varphi \end{bmatrix} \]

the area B can be represented as the image F(D) of the rectangle $D = [0, R] \times [0, 2\pi]$.
However, in order to fulfil the assumptions of Proposition 17.14 we have
to switch to open domains on which F is a diffeomorphism. We can obtain this, for
instance, by removing the boundary and the half ray $\{(x, y) \ ;\ 0 \le x \le R,\ y = 0\}$
of the circle B and the boundary of the rectangle D. On the smaller domains D', B'
obtained in this way, F is a diffeomorphism. However, since B differs from B' and
D differs from D' by sets of measure zero, the value of the integral is not changed
if one replaces B by B' and D by D', see Remark 17.6. We have

\[ F'(r, \varphi) = \begin{bmatrix} \cos \varphi & -r \sin \varphi \\ \sin \varphi & r \cos \varphi \end{bmatrix}, \qquad \det F'(r, \varphi) = r \cos^2 \varphi + r \sin^2 \varphi = r. \]

Substituting $x = r \cos \varphi$, $y = r \sin \varphi$ results in $x^2 + y^2 = r^2$ and we obtain the volume
from the transformation formula as

\[
\int_0^R \int_0^{2\pi} \sqrt{R^2 - r^2}\; r\, d\varphi\, dr
= \int_0^R 2\pi r \sqrt{R^2 - r^2}\, dr
= -\frac{2\pi}{3} (R^2 - r^2)^{3/2} \Big|_{r=0}^{r=R}
= \frac{2\pi}{3} R^3,
\]

which coincides with the known result from elementary geometry.
17.4 Exercises


1. Compute the volume of the parabolic dome $z = 2 - x^2 - y^2$ above the square
domain $D : -1 \le x \le 1,\ -1 \le y \le 1$.

2. (From statics) Compute the axial moment of inertia $\iint_D y^2\, dx\, dy$ of a rectangular
cross section $D : 0 \le x \le b,\ -h/2 \le y \le h/2$, where b > 0, h > 0.

3. Compute the volume of the body bounded by the plane z = x + y above the
domain $D : 0 \le x \le 1,\ 0 \le y \le \sqrt{1 - x^2}$.

4. Compute the volume of the body bounded by the plane z = 6 - x - y above
the domain D, which is bounded by the y-axis and the straight lines x + y = 6,
x + 3y = 6 ($x \ge 0$, $y \ge 0$).

5. Compute the geometric centre of gravity of the domain $D : 0 \le x \le 1,\ 0 \le y \le 1 - x^2$.

6. Compute the area and the geometric centre of gravity of the semi-ellipse

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1, \qquad y \ge 0. \]

Hint. Introduce elliptic coordinates $x = a r \cos \varphi$, $y = b r \sin \varphi$, $0 \le r \le 1$,
$0 \le \varphi \le \pi$, compute the Jacobian and use the transformation formula.

7. (From statics) Compute the axial moment of inertia of a ring with inner radius
$R_1$ and outer radius $R_2$ with respect to the central axis, i.e. the integral
$\iint_D (x^2 + y^2)\, dx\, dy$ over the domain $D : R_1 \le \sqrt{x^2 + y^2} \le R_2$.

8. Modify the M-file mat17_1.m so that it can evaluate Riemann sums over
equidistant partitions with $\Delta x \neq \Delta y$.

9. Let the domain D be bounded by the curves

\[ y = x \quad\text{and}\quad y = x^2, \qquad 0 \le x \le 1. \]


(a) Sketch D.

(b) Compute the area of D by means of the double integral $F = \iint_D d(x, y)$.

(c) Compute the statical moments $\iint_D x\, d(x, y)$ and $\iint_D y\, d(x, y)$.

10. Compute the statical moment $\iint_D y\, d(x, y)$ of the half-disk

\[ D = \{(x, y) \in \mathbb{R}^2 \ ;\ -1 \le x \le 1,\ 0 \le y \le \sqrt{1 - x^2}\} \]

(a) as a double integral, writing D as a normal domain of type I;

(b) by transformation to polar coordinates.

11. The following integral is written in terms of a normal domain of type II:

\[ \int_0^1 \int_y^{y^2 + 1} x^2 y\, dx\, dy. \]

(a) Compute the integral.

(b) Sketch the domain and represent it as a normal domain of type I.

(c) Interchange the order of integration and recompute the integral.
Hint. In (c) two summands are needed.

18 Linear Regression


Linear regression is one of the most important methods of data analysis. It is used
to determine model parameters, to fit models, to assess the importance of
influencing factors, and for prediction, in all areas of the human, natural and economic
sciences. Computer scientists who work closely with people from these areas will
definitely come across regression models.

The aim of this chapter is a first introduction to the subject. We deduce the
coefficients of the regression models using the method of least squares to minimise
the errors. We will only employ methods of descriptive data analysis. We do not touch
upon the more advanced probabilistic approaches which are topics of statistics. For
that, as well as for nonlinear regression, we refer to the specialised literature.

We  start  with  simple  (or  univariate)  linear  regression — a  model  with  a  single  input 
and  a  single  output  quantity — and  explain  the  basic  ideas  of  analysis  of  variance  for 
model  evaluation.  Then  we  turn  to  multiple  (or  multivariate)  linear  regression  with 
several  input  quantities.  The  chapter  closes  with  a  descriptive  approach  to  determine 
the  influence  of  the  individual  coefficients. 


18.1 Simple Linear Regression

A  first  glance  at  the  basic  idea  of  linear  regression  was  already  given  in  Sect.  8.3.  In 
extension  to  this,  we  will  now  allow  more  general  models,  in  particular  regression 
lines  with  nonzero  intercept. 

Consider pairs of data $(x_1, y_1), \ldots, (x_n, y_n)$, obtained as observations or measurements.
Geometrically they form a scatter plot in the plane. The values $x_i$ and $y_i$
may appear repeatedly in this list of data. In particular, for a given $x_i$ there can be
data points with different values $y_{i1}, \ldots, y_{ip}$. The general task of linear regression
is to fit the graph of a function

\[ y = \beta_0 \varphi_0(x) + \beta_1 \varphi_1(x) + \cdots + \beta_m \varphi_m(x) \]

to the n data points $(x_1, y_1), \ldots, (x_n, y_n)$. Here the shape functions $\varphi_j(x)$ are given
and the (unknown) coefficients $\beta_j$ are to be determined such that the sum of squares
of the errors is minimal (method of least squares):

\[ \sum_{i=1}^{n} \big( y_i - \beta_0 \varphi_0(x_i) - \beta_1 \varphi_1(x_i) - \cdots - \beta_m \varphi_m(x_i) \big)^2 \to \min \]

Fig. 18.1 Scatter plot height/weight, line of best fit, best parabola

The regression is called linear because the function y depends linearly on the
unknown coefficients $\beta_j$. The choice of the shape functions ensues either from a
possible theoretical model or empirically, where different possibilities are subjected
to statistical tests. The choice is made, for example, according to the proportion of
data variability which is explained by the regression; more about that in Sect. 18.4.
The standard question of (simple or univariate) linear regression is to fit a linear
model

\[ y = \beta_0 + \beta_1 x \]

to the data, i.e., to find the line of best fit or regression line through the scatter plot.
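
To make the least squares criterion concrete, the following Python sketch fits the line of best fit to a small data set and evaluates the sum of squared errors. Two things here are assumptions of the sketch and are not taken from the text up to this point: the closed-form expressions for the coefficients (the standard solution of the minimisation problem for the model y = β0 + β1 x), and the data values, which are hypothetical and not the Innsbruck sample.

```python
def fit_line(xs, ys):
    """Coefficients (beta0, beta1) of the least squares line y = beta0 + beta1*x.

    Uses the standard closed-form solution of the minimisation problem.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta1 = sxy / sxx
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1


def sum_of_squares(xs, ys, beta0, beta1):
    """Sum of squared errors of the linear model on the data."""
    return sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: heights [cm] and weights [kg].
heights = [165.0, 170.0, 175.0, 180.0, 185.0, 190.0]
weights = [58.0, 66.0, 69.0, 77.0, 80.0, 91.0]

beta0, beta1 = fit_line(heights, weights)
print(beta0, beta1, sum_of_squares(heights, weights, beta0, beta1))
```

Changing either coefficient away from the fitted values can only increase the sum of squares, which is a simple way to check a fit by hand.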

Example 18.1 A sample of n = 70 computer science students at the University of
Innsbruck in 2002 yielded the data depicted in Fig. 18.1. Here x denotes the height
[cm] and y the weight [kg] of the students. The left picture in Fig. 18.1 shows the
regression line $y = \beta_0 + \beta_1 x$, the right one a fitted quadratic parabola of the form

\[ y = \beta_0 + \beta_1 x + \beta_2 x^2. \]

Note the difference to Fig. 8.8 where the line of best fit through the origin was used;
i.e., the intercept $\beta_0$ was set to zero in the linear model.
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64  66  68  70  72  74  76  fathers  [inch] 

Fig.  18.2  Scatter  plot  height  of  fathers/height  of  the  sons,  regression  line 


A variant of the standard problem is obtained by considering the linear model

$$\eta = \beta_0 + \beta_1 \xi$$

for the transformed variables

$$\xi = \varphi(x), \qquad \eta = \psi(y).$$

Formally this problem is identical to the standard problem of linear regression, however,
with transformed data

$$(\xi_i, \eta_i) = \bigl(\varphi(x_i), \psi(y_i)\bigr).$$

A typical example is given by the loglinear regression with $\xi = \log x$, $\eta = \log y$

$$\log y = \beta_0 + \beta_1 \log x,$$

which in the original variables amounts to the exponential model

$$y = e^{\beta_0} x^{\beta_1}.$$

If  the  variable  x  itself  has  several  components  which  enter  linearly  in  the  model,  then 
one  speaks  of  multiple  linear  regression.  We  will  deal  with  it  in  Sect.  18.3. 

The notion of regression was introduced by Galton¹ who observed, while investigating
the height of sons/fathers, a tendency of regressing to the average size. The
data taken from [15] clearly show this effect, see Fig. 18.2. The method of least
squares goes back to Gauss.

After these introductory remarks about the general concept of linear regression,
we turn to simple linear regression. We start with setting up the model. The postulated
relationship between $x$ and $y$ is linear

$$y = \beta_0 + \beta_1 x$$

with unknown coefficients $\beta_0$ and $\beta_1$. In general, the given data will not exactly lie
on a straight line but deviate by $\varepsilon_i$, i.e.,

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,$$

as represented in Fig. 18.3.

Fig. 18.3 Linear model and error $\varepsilon_i$

¹ F. Galton, 1822-1911.

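From the given data we want to obtain estimated values $\hat\beta_0, \hat\beta_1$ for $\beta_0, \beta_1$. This is
achieved through minimising the sum of squares of the errors

$$L(\beta_0, \beta_1) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2,$$

so that $\hat\beta_0, \hat\beta_1$ solve the minimisation problem

$$L(\hat\beta_0, \hat\beta_1) = \min \bigl\{ L(\beta_0, \beta_1) \,;\ \beta_0 \in \mathbb{R},\ \beta_1 \in \mathbb{R} \bigr\}.$$

We obtain $\hat\beta_0$ and $\hat\beta_1$ by setting the partial derivatives of $L$ with respect to $\beta_0$ and $\beta_1$
to zero:

$$\frac{\partial L}{\partial \beta_0}(\hat\beta_0, \hat\beta_1) = -2 \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0,$$

$$\frac{\partial L}{\partial \beta_1}(\hat\beta_0, \hat\beta_1) = -2 \sum_{i=1}^{n} x_i (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0.$$

This leads to a linear system of equations for $\hat\beta_0, \hat\beta_1$, the so-called normal equations

$$n \hat\beta_0 + \Bigl(\sum x_i\Bigr) \hat\beta_1 = \sum y_i,$$

$$\Bigl(\sum x_i\Bigr) \hat\beta_0 + \Bigl(\sum x_i^2\Bigr) \hat\beta_1 = \sum x_i y_i.$$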


Proposition 18.2 Assume that at least two $x$-values in the data set $(x_i, y_i)$, $i =
1, \ldots, n$ are different. Then the normal equations have a unique solution

$$\hat\beta_0 = \Bigl(\frac{1}{n} \sum y_i\Bigr) - \Bigl(\frac{1}{n} \sum x_i\Bigr) \hat\beta_1, \qquad
\hat\beta_1 = \frac{\sum x_i y_i - \frac{1}{n} \sum x_i \sum y_i}{\sum x_i^2 - \frac{1}{n} \bigl(\sum x_i\bigr)^2},$$

which minimises the sum of squares $L(\beta_0, \beta_1)$ of the errors.

Fig. 18.4 Linear model, prediction, residual


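Proof With the notations $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{1} = (1, \ldots, 1)$ the determinant of the
normal equations is $n \sum x_i^2 - (\sum x_i)^2 = \|\mathbf{x}\|^2 \|\mathbf{1}\|^2 - \langle \mathbf{x}, \mathbf{1} \rangle^2$. For vectors of length
$n = 2$ and $n = 3$ we know that $\langle \mathbf{x}, \mathbf{1} \rangle = \|\mathbf{x}\| \, \|\mathbf{1}\| \cdot \cos \angle(\mathbf{x}, \mathbf{1})$, see Appendix A.4, and
thus $\|\mathbf{x}\| \, \|\mathbf{1}\| \ge |\langle \mathbf{x}, \mathbf{1} \rangle|$. This relation, however, is valid in any dimension $n$ (see for
instance [2, Chap. VI, Theorem 1.1]), and equality can only occur if $\mathbf{x}$ is parallel to
$\mathbf{1}$, so all components $x_i$ are equal. As this possibility was excluded, the determinant
of the normal equations is greater than zero and the solution formula is obtained by
a simple calculation.

In order to show that this solution minimises $L(\beta_0, \beta_1)$, we compute the Hessian
matrix

$$H_L = \begin{bmatrix} \dfrac{\partial^2 L}{\partial \beta_0^2} & \dfrac{\partial^2 L}{\partial \beta_0 \, \partial \beta_1} \\[2ex] \dfrac{\partial^2 L}{\partial \beta_1 \, \partial \beta_0} & \dfrac{\partial^2 L}{\partial \beta_1^2} \end{bmatrix}
= 2 \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix}
= 2 \begin{bmatrix} \|\mathbf{1}\|^2 & \langle \mathbf{x}, \mathbf{1} \rangle \\ \langle \mathbf{x}, \mathbf{1} \rangle & \|\mathbf{x}\|^2 \end{bmatrix}.$$

The entry $\partial^2 L / \partial \beta_0^2 = 2n$ and $\det H_L = 4 \bigl( \|\mathbf{x}\|^2 \|\mathbf{1}\|^2 - \langle \mathbf{x}, \mathbf{1} \rangle^2 \bigr)$ are both positive.
According to Proposition 15.28, $L$ has an isolated local minimum at the point
$(\hat\beta_0, \hat\beta_1)$. Due to the uniqueness of the solution, this is the only minimum of $L$. □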

The assumption that there are at least two different $x_i$-values in the data set is not
a restriction since otherwise the regression problem is not meaningful. The result of
the regression is the predicted regression line

$$y = \hat\beta_0 + \hat\beta_1 x.$$

The values predicted by the model are then

$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_i, \qquad i = 1, \ldots, n.$$

Their deviations from the data values $y_i$ are called residuals

$$e_i = y_i - \hat y_i = y_i - \hat\beta_0 - \hat\beta_1 x_i, \qquad i = 1, \ldots, n.$$

The  meaning  of  these  quantities  can  be  seen  in  Fig.  18.4. 
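
The closed-form solution of Proposition 18.2, together with the predicted values and residuals, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_line` and the four data points are hypothetical and not taken from the book's programs.

```python
def fit_line(xs, ys):
    """Return (beta0, beta1) of the least-squares line y = beta0 + beta1*x,
    using the solution formula of the normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = sxx - sx * sx / n  # vanishes when all x-values coincide
    if denom == 0:
        raise ValueError("at least two different x-values are required")
    beta1 = (sxy - sx * sy / n) / denom
    beta0 = sy / n - beta1 * sx / n
    return beta0, beta1


# Hypothetical toy data (not from the book)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]

beta0, beta1 = fit_line(xs, ys)
predicted = [beta0 + beta1 * x for x in xs]         # predicted values
residuals = [y - p for y, p in zip(ys, predicted)]  # residuals e_i

print("beta0 =", beta0, "beta1 =", beta1)
print("residuals:", residuals)
```

Because the model contains the intercept, the residuals sum to zero up to rounding; this is the first normal equation.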

With the above specifications, the deterministic regression model is completed. In
the statistical regression model the errors $\varepsilon_i$ are interpreted as random variables with
mean zero. Under further probabilistic assumptions the model is made accessible to
statistical tests and diagnostic procedures. As mentioned in the introduction, we will
not pursue this path here but remain in the framework of descriptive data analysis.

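In order to obtain a more lucid representation, we will reformulate the normal
equations. For this we introduce the following vectors and matrices:

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad
\boldsymbol\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \quad
\boldsymbol\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}.$$

By this, the relations

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, \ldots, n,$$

can be written simply as

$$\mathbf{y} = \mathbf{X} \boldsymbol\beta + \boldsymbol\varepsilon.$$

Further

$$\mathbf{X}^T \mathbf{X} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
= \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix},$$

$$\mathbf{X}^T \mathbf{y} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix},$$

so that the normal equations take the form

$$\mathbf{X}^T \mathbf{X} \hat{\boldsymbol\beta} = \mathbf{X}^T \mathbf{y}$$

with solution

$$\hat{\boldsymbol\beta} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}.$$

The predicted values and residuals are

$$\hat{\mathbf{y}} = \mathbf{X} \hat{\boldsymbol\beta}, \qquad \mathbf{e} = \mathbf{y} - \hat{\mathbf{y}}.$$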
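
To make the matrix form concrete, the following sketch assembles the design matrix, forms the two products that appear in the normal equations and solves the resulting 2x2 system by Cramer's rule. The helper names `normal_equations` and `solve_2x2` and the toy data are assumptions made for illustration; plain Python lists stand in for matrices.

```python
def normal_equations(xs, ys):
    """Assemble X, X^T X and X^T y for the design matrix with rows (1, x_i)."""
    X = [[1.0, x] for x in xs]
    XtX = [[sum(row[j] * row[k] for row in X) for k in range(2)]
           for j in range(2)]
    Xty = [sum(row[j] * y for row, y in zip(X, ys)) for j in range(2)]
    return X, XtX, Xty


def solve_2x2(A, b):
    """Solve A z = b for a 2x2 matrix A by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]


# Hypothetical toy data (not from the book)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]

X, XtX, Xty = normal_equations(xs, ys)
beta = solve_2x2(XtX, Xty)                               # estimated coefficients
y_hat = [row[0] * beta[0] + row[1] * beta[1] for row in X]  # predicted values
e = [y - p for y, p in zip(ys, y_hat)]                   # residuals

print("X^T X =", XtX)
print("X^T y =", Xty)
print("beta  =", beta)
```

The entries of the first product are exactly n, the sum of the x-values and the sum of their squares, so the system agrees with the normal equations written out componentwise.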

Example 18.3 (Continuation of Example 18.1) The data for $x$ = height and $y$ =
weight can be found in the M-file mat08_3.m; the matrix $\mathbf{X}$ is generated in MATLAB
by

    X = [ones(size(x)), x];

the regression coefficients are obtained by

    beta = inv(X'*X)*X'*y;

The command beta = X\y permits a more stable calculation in MATLAB. In our
case the result is

$$\hat\beta_0 = -85.02, \qquad \hat\beta_1 = 0.8787.$$

This  gives  the  regression  line  depicted  in  Fig.  18.1. 


18.2 Rudiments of the Analysis of Variance

First  indications  for  the  quality  of  fit  of  the  linear  model  can  be  obtained  from 
the  analysis  of  variance  (ANOVA),  which  also  forms  the  basis  for  more  advanced 
statistical  test  procedures. 

The arithmetic mean of the $y$-values $y_1, \ldots, y_n$ is

$$\bar y = \frac{1}{n} \sum_{i=1}^{n} y_i.$$

The deviation of the measured value $y_i$ from the mean value $\bar y$ is $y_i - \bar y$. The total
sum of squares or total variability of the data is

$$S_{yy} = \sum_{i=1}^{n} (y_i - \bar y)^2.$$

The total variability is split into two components in the following way:

$$\sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} (\hat y_i - \bar y)^2 + \sum_{i=1}^{n} (y_i - \hat y_i)^2.$$

The validity of this relationship will be proven in Proposition 18.4 below. It is interpreted
as follows: $\hat y_i - \bar y$ is the deviation of the predicted value from the mean value,
and

$$SS_R = \sum_{i=1}^{n} (\hat y_i - \bar y)^2$$

the regression sum of squares. This is interpreted as the part of the data variability
accounted for by the model. On the other hand $e_i = y_i - \hat y_i$ are the residuals, and

$$SS_E = \sum_{i=1}^{n} (y_i - \hat y_i)^2$$

is the error sum of squares which is interpreted as the part of the variability that
remains unexplained by the linear model. These notions are best explained by considering
the two extremal cases.

(a) The data values $y_i$ themselves already lie on a straight line. Then all $\hat y_i = y_i$ and
thus $S_{yy} = SS_R$, $SS_E = 0$, and the regression model describes the data record
exactly.

(b) The data values are in no linear relation. Then the line of best fit is the horizontal
line through the mean value (see Exercise 13 of Chap. 8), so $\hat y_i = \bar y$ for all $i$
and hence $S_{yy} = SS_E$, $SS_R = 0$. This means that the regression model does not
offer any indication for a linear relation between the values.

The  basis  of  these  considerations  is  the  validity  of  the  following  formula. 

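Proposition 18.4 (Partitioning of total variability) $S_{yy} = SS_R + SS_E$.

Proof In the following we use matrix and vector notation. In particular, we employ
the formulas

$$\mathbf{a}^T \mathbf{b} = \mathbf{b}^T \mathbf{a} = \sum a_i b_i, \qquad \mathbf{1}^T \mathbf{a} = \mathbf{a}^T \mathbf{1} = \sum a_i = n \bar a, \qquad \mathbf{a}^T \mathbf{a} = \sum a_i^2$$

for vectors $\mathbf{a}, \mathbf{b}$, and the matrix identity $(\mathbf{A}\mathbf{B})^T = \mathbf{B}^T \mathbf{A}^T$. We have

$$S_{yy} = (\mathbf{y} - \bar y \mathbf{1})^T (\mathbf{y} - \bar y \mathbf{1}) = \mathbf{y}^T \mathbf{y} - \bar y (\mathbf{1}^T \mathbf{y}) - (\mathbf{y}^T \mathbf{1}) \bar y + n \bar y^2$$
$$= \mathbf{y}^T \mathbf{y} - n \bar y^2 - n \bar y^2 + n \bar y^2 = \mathbf{y}^T \mathbf{y} - n \bar y^2,$$

$$SS_E = \mathbf{e}^T \mathbf{e} = (\mathbf{y} - \hat{\mathbf{y}})^T (\mathbf{y} - \hat{\mathbf{y}}) = (\mathbf{y} - \mathbf{X} \hat{\boldsymbol\beta})^T (\mathbf{y} - \mathbf{X} \hat{\boldsymbol\beta})$$
$$= \mathbf{y}^T \mathbf{y} - \hat{\boldsymbol\beta}^T \mathbf{X}^T \mathbf{y} - \mathbf{y}^T \mathbf{X} \hat{\boldsymbol\beta} + \hat{\boldsymbol\beta}^T \mathbf{X}^T \mathbf{X} \hat{\boldsymbol\beta} = \mathbf{y}^T \mathbf{y} - \hat{\boldsymbol\beta}^T \mathbf{X}^T \mathbf{y}.$$

For the last equality we have used the normal equations $\mathbf{X}^T \mathbf{X} \hat{\boldsymbol\beta} = \mathbf{X}^T \mathbf{y}$ and the
transposition formula $\hat{\boldsymbol\beta}^T \mathbf{X}^T \mathbf{y} = (\mathbf{y}^T \mathbf{X} \hat{\boldsymbol\beta})^T = \mathbf{y}^T \mathbf{X} \hat{\boldsymbol\beta}$. The relation $\hat{\mathbf{y}} = \mathbf{X} \hat{\boldsymbol\beta}$ implies in
particular $\mathbf{X}^T \hat{\mathbf{y}} = \mathbf{X}^T \mathbf{y}$. Since the first line of $\mathbf{X}^T$ consists of ones only, it follows that
$\mathbf{1}^T \hat{\mathbf{y}} = \mathbf{1}^T \mathbf{y}$ and thus

$$SS_R = (\hat{\mathbf{y}} - \bar y \mathbf{1})^T (\hat{\mathbf{y}} - \bar y \mathbf{1}) = \hat{\mathbf{y}}^T \hat{\mathbf{y}} - \bar y (\mathbf{1}^T \hat{\mathbf{y}}) - (\hat{\mathbf{y}}^T \mathbf{1}) \bar y + n \bar y^2$$
$$= \hat{\mathbf{y}}^T \hat{\mathbf{y}} - n \bar y^2 - n \bar y^2 + n \bar y^2 = \hat{\boldsymbol\beta}^T (\mathbf{X}^T \mathbf{X} \hat{\boldsymbol\beta}) - n \bar y^2 = \hat{\boldsymbol\beta}^T \mathbf{X}^T \mathbf{y} - n \bar y^2.$$

Summation of the obtained expressions for $SS_E$ and $SS_R$ results in the sought-after
formula. □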

The partitioning of total variability

$$S_{yy} = SS_R + SS_E$$

and its above interpretation suggests using the quantity

$$R^2 = \frac{SS_R}{S_{yy}}$$

for the assessment of the goodness of fit. The quantity $R^2$ is called coefficient of
determination and measures the fraction of variability explained by the regression.
In the limiting case of an exact fit, where the regression line passes through all data
points, we have $SS_E = 0$ and thus $R^2 = 1$. A small value of $R^2$ indicates that the
linear model does not fit the data.
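
The partitioning of total variability and the coefficient of determination can be checked numerically. The sketch below is an illustration only: the function name `anova` and the data are hypothetical, and the slope is computed from centred sums, which is algebraically equivalent to the solution formula of the normal equations.

```python
def anova(xs, ys):
    """Return (S_yy, SS_R, SS_E) for the simple linear model with intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    # Centred form of the least-squares slope and intercept
    beta1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    beta0 = y_bar - beta1 * x_bar
    y_hat = [beta0 + beta1 * x for x in xs]
    s_yy = sum((y - y_bar) ** 2 for y in ys)              # total variability
    ss_r = sum((p - y_bar) ** 2 for p in y_hat)           # explained part
    ss_e = sum((y - p) ** 2 for y, p in zip(ys, y_hat))   # unexplained part
    return s_yy, ss_r, ss_e


# Hypothetical toy data (not from the book)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]

s_yy, ss_r, ss_e = anova(xs, ys)
r2 = ss_r / s_yy  # coefficient of determination

print("S_yy =", s_yy, "SS_R =", ss_r, "SS_E =", ss_e)
print("R^2  =", r2)
```

For these four points the three sums come out as 10, 9.8 and 0.2 up to rounding, so the partition holds and the coefficient of determination is 0.98.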

Remark 18.5 An essential point in the proof of Proposition 18.4 was the property of
$\mathbf{X}^T$ that its first line was composed of ones only. This is a consequence of the fact that
$\beta_0$ was a model parameter. In the regression where a straight line through the origin
is used (see Sect. 8.3) this is not the case. For a regression which does not have $\beta_0$ as
a parameter the variance partition is not valid and the coefficient of determination is
meaningless.

Example 18.6 We continue the investigation of the relation between height and
weight from Example 18.1. Using the MATLAB program mat18_1.m and entering
the data from mat08_3.m results in

$$S_{yy} = 9584.9, \qquad SS_E = 8094.4, \qquad SS_R = 1490.5$$

and

$$R^2 = 0.1555, \qquad R = 0.3943.$$

The  low  value  of  R2  is  a  clear  indication  that  height  and  weight  are  not  in  a  linear 
relation. 

Example 18.7 In Sect. 9.1 the fractal dimension $d = d(A)$ of a bounded subset $A$ of
$\mathbb{R}^2$ was defined by the limit

$$d = d(A) = - \lim_{\varepsilon \to 0+} \log N(A, \varepsilon) / \log \varepsilon,$$

where $N(A, \varepsilon)$ denoted the smallest number of squares of side length $\varepsilon$ needed to
cover $A$. For the experimental determination of the dimension of a fractal set $A$, one
rasters the plane with different mesh sizes $\varepsilon$ and determines the number $N = N(A, \varepsilon)$
of boxes that have a non-empty intersection with the fractal. As explained in Sect. 9.1,
one uses the approximation

$$N(A, \varepsilon) \approx C \cdot \varepsilon^{-d}.$$

Applying logarithms results in

$$\log N(A, \varepsilon) \approx \log C + d \log \frac{1}{\varepsilon},$$

which is a linear model

$$y \approx \beta_0 + \beta_1 x$$

for the quantities $x = \log 1/\varepsilon$, $y = \log N(A, \varepsilon)$. The regression coefficient $\hat\beta_1$ can
be used as an estimate for the fractal dimension $d$.

In Exercise 1 of Sect. 9.6 this procedure was applied to the coastline of Great
Britain. Assume that the following values were obtained:

    1/ε          4     8    12    16    24    32
    N(A, ε)     16    48    90   120   192   283

A linear regression through the logarithms $x = \log 1/\varepsilon$, $y = \log N(A, \varepsilon)$ yields the
coefficients

$$\hat\beta_0 = 0.9849, \qquad d \approx \hat\beta_1 = 1.3616$$

with the coefficient of determination

$$R^2 = 0.9930.$$

This is a very good fit, which is also confirmed by Fig. 18.5. The given data thus
indicate that the fractal dimension of the coastline of Great Britain is $d = 1.36$.
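
The loglinear fit of Example 18.7 can be reproduced with a short Python sketch. It takes the six box-counting values listed in the example, uses the natural logarithm (an assumption; the slope does not depend on the base, the intercept does) and the centred form of the least-squares formulas.

```python
import math

# Box-counting values from Example 18.7: 1/epsilon and N(A, epsilon)
inv_eps = [4, 8, 12, 16, 24, 32]
counts = [16, 48, 90, 120, 192, 283]

xs = [math.log(v) for v in inv_eps]  # x = log(1/epsilon)
ys = [math.log(c) for c in counts]   # y = log N(A, epsilon)

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

beta1 = s_xy / s_xx              # estimate of the fractal dimension d
beta0 = y_bar - beta1 * x_bar    # estimate of log C

s_yy = sum((y - y_bar) ** 2 for y in ys)
ss_r = sum((beta0 + beta1 * x - y_bar) ** 2 for x in xs)
r2 = ss_r / s_yy                 # coefficient of determination

print(round(beta0, 4), round(beta1, 4), round(r2, 4))
```

With natural logarithms the sketch agrees with the coefficients and the coefficient of determination stated in the example to the printed precision.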

A  word  of  caution  is  in  order.  Data  analysis  can  only  supply  indications,  but  never 
a proof that a model is correct. Even if we choose among a number of wrong models
the one with the largest $R^2$, this model will not become correct. A healthy amount of
skepticism  with  respect  to  purely  empirically  inferred  relations  is  advisable;  models 
should  always  be  critically  questioned.  Scientific  progress  arises  from  the  interplay 
between  the  invention  of  models  and  their  experimental  validation  through  data. 


Fig. 18.5 Fractal dimension of the coastline of Great Britain


18.3 Multiple Linear Regression

In multiple (multivariate) linear regression the variable $y$ does not just depend on
one regressor variable $x$, but on several variables, for instance $x_1, x_2, \ldots, x_k$. We
emphasise that the notation with respect to Sect. 18.1 is changed; there $x_i$ denoted
the $i$th data value, and now $x_i$ refers to the $i$th regressor variable. The measurements of
the $i$th regressor variable are now denoted with two indices, namely $x_{i1}, x_{i2}, \ldots, x_{in}$.
In total, there are $k \times n$ data values. We again look for a linear model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$$

with the yet unknown coefficients $\beta_0, \beta_1, \ldots, \beta_k$.

Example 18.8 A vending machine company wants to analyse the delivery time,
i.e., the time span $y$ which a driver needs to refill a machine. The most important
parameters are the number $x_1$ of refilled product units and the distance $x_2$
walked by the driver. The results of an observation of 25 services are given in
the M-file mat18_3.m. The data values are taken from [19]. The observations
$(x_{11}, x_{21}), (x_{12}, x_{22}), (x_{13}, x_{23}), \ldots, (x_{1,25}, x_{2,25})$ with the corresponding service
times $y_1, y_2, \ldots, y_{25}$ yield a scatter plot in space to which a plane of the form
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ should be fitted (Fig. 18.6; use the M-file mat18_4.m for
visualisation).

Remark 18.9 A special case of the general multiple linear model $y = \beta_0 + \beta_1 x_1 +
\cdots + \beta_k x_k$ is simple linear regression with several nonlinear form functions (as
mentioned in Sect. 18.1), i.e.,

$$y = \beta_0 + \beta_1 \varphi_1(x) + \beta_2 \varphi_2(x) + \cdots + \beta_k \varphi_k(x),$$

where $x_1 = \varphi_1(x), x_2 = \varphi_2(x), \ldots, x_k = \varphi_k(x)$ are considered as regressor variables.
In particular one can allow polynomial models

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k$$

or still more general interactions between several variables, for instance

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2.$$

All these cases are treated in the same way as the standard problem of multiple linear
regression, after renaming the variables.

Fig. 18.6 Multiple linear regression through a scatter plot in space

The data values for the individual regressor variables are schematically represented
as follows:

    Variable         y      x_1     x_2     ...   x_k
    Observation 1    y_1    x_11    x_21    ...   x_k1
    Observation 2    y_2    x_12    x_22    ...   x_k2
    ...
    Observation n    y_n    x_1n    x_2n    ...   x_kn

Each value $y_i$ is to be approximated by

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \varepsilon_i, \qquad i = 1, \ldots, n$$

with the errors $\varepsilon_i$. The estimated coefficients $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ are again obtained as
the solution of the minimisation problem

$$L(\beta_0, \beta_1, \ldots, \beta_k) = \sum_{i=1}^{n} \varepsilon_i^2 \to \min$$

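Using vector and matrix notation

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} 1 & x_{11} & x_{21} & \ldots & x_{k1} \\ 1 & x_{12} & x_{22} & \ldots & x_{k2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{1n} & x_{2n} & \ldots & x_{kn} \end{bmatrix}, \quad
\boldsymbol\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
\boldsymbol\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

the linear model can again be written for short as

$$\mathbf{y} = \mathbf{X} \boldsymbol\beta + \boldsymbol\varepsilon.$$

The coefficients of best fit are obtained as in Sect. 18.1 by the formula

$$\hat{\boldsymbol\beta} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$$

with the predicted values and the residuals

$$\hat{\mathbf{y}} = \mathbf{X} \hat{\boldsymbol\beta}, \qquad \mathbf{e} = \mathbf{y} - \hat{\mathbf{y}}.$$

The partitioning of total variability

$$S_{yy} = SS_R + SS_E$$

is still valid; the multiple coefficient of determination

$$R^2 = SS_R / S_{yy}$$

is an indicator of the goodness of fit of the model.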
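
A minimal sketch of multiple linear regression via the normal equations, written in plain Python, may help to see how the design matrix is built. The helper names `solve` and `fit`, the use of Gaussian elimination with partial pivoting, and the toy data (which lie exactly on a plane) are assumptions for illustration; they are not the book's programs.

```python
def solve(A, b):
    """Solve the square system A z = b by Gaussian elimination
    with partial pivoting (fails with ZeroDivisionError if A is singular)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def fit(rows, ys):
    """rows[i] holds the regressor values of observation i.
    Returns the estimated coefficients [beta0, beta1, ..., betak]."""
    X = [[1.0] + list(r) for r in rows]  # leading column of ones
    m = len(X[0])
    XtX = [[sum(x[j] * x[l] for x in X) for l in range(m)] for j in range(m)]
    Xty = [sum(x[j] * y for x, y in zip(X, ys)) for j in range(m)]
    return solve(XtX, Xty)


# Hypothetical data lying exactly on the plane y = 1 + 2*x1 - 3*x2
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
ys = [1.0, 3.0, -2.0, 0.0, 2.0]

beta = fit(rows, ys)
print(beta)
```

Solving the normal equations directly is the simplest route but not the most stable one; orthogonalisation-based least-squares solvers, such as the one behind the MATLAB backslash command mentioned in Example 18.3, are preferable for ill-conditioned data.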


Example 18.10 We continue the analysis of the delivery times from Example 18.8.
Using the MATLAB program mat18_2.m and entering the data from mat18_3.m
results in

$$\hat{\boldsymbol\beta} = \begin{bmatrix} 2.3412 \\ 1.6159 \\ 0.0144 \end{bmatrix}.$$

We obtain the model

$$\hat y = 2.3412 + 1.6159 x_1 + 0.0144 x_2$$

with the multiple coefficient of determination of

$$R^2 = 0.9596$$

and the partitioning of total variability

$$S_{yy} = 5784.5, \qquad SS_R = 5550.8, \qquad SS_E = 233.7.$$

In this example merely $(1 - R^2) \cdot 100\% \approx 4\%$ of the variability of the data is not
explained by the regression, a very satisfactory goodness of fit.


18.4 Model Fitting and Variable Selection

A recurring problem is to decide which variables should be included in the model.
Would the inclusion of $x_3 = x_1^2$ and $x_4 = x_1 x_2$, i.e., the model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_1 x_2,$$

lead to better results, and can, e.g., the term $\beta_2 x_2$ be eliminated subsequently? It is
not desirable to have too many variables in the model. If there are as many variables
as data points, then one can fit the regression exactly through the data and the model
would lose its predictive power. A criterion will definitely be to reach a value of
$R^2$ which is as large as possible. Another aim is to eliminate variables that do not
contribute essentially to the total variability. An algorithmic procedure for identifying
these variables is the sequential partitioning of total variability.

Sequential partitioning of total variability. We include variables stepwise in the
model, thus consider the increasing sequence of models with corresponding $SS_R$:

    y = β_0                                           SS_R(β_0),
    y = β_0 + β_1 x_1                                 SS_R(β_0, β_1),
    y = β_0 + β_1 x_1 + β_2 x_2                       SS_R(β_0, β_1, β_2),
    ...
    y = β_0 + β_1 x_1 + β_2 x_2 + ... + β_k x_k       SS_R(β_0, β_1, ..., β_k) = SS_R.

Note that $SS_R(\beta_0) = 0$, since in the initial model $\hat\beta_0 = \bar y$. The additional explanatory
power of the variable $x_1$ is measured by

$$SS_R(\beta_1 | \beta_0) = SS_R(\beta_0, \beta_1) - 0,$$

the power of variable $x_2$ (if $x_1$ is already in the model) by

$$SS_R(\beta_2 | \beta_0, \beta_1) = SS_R(\beta_0, \beta_1, \beta_2) - SS_R(\beta_0, \beta_1),$$

the power of variable $x_k$ (if $x_1, x_2, \ldots, x_{k-1}$ are in the model) by

$$SS_R(\beta_k | \beta_0, \beta_1, \ldots, \beta_{k-1}) = SS_R(\beta_0, \beta_1, \ldots, \beta_k) - SS_R(\beta_0, \beta_1, \ldots, \beta_{k-1}).$$

Obviously,

$$SS_R(\beta_1 | \beta_0) + SS_R(\beta_2 | \beta_0, \beta_1) + SS_R(\beta_3 | \beta_0, \beta_1, \beta_2) + \cdots
+ SS_R(\beta_k | \beta_0, \beta_1, \beta_2, \ldots, \beta_{k-1}) = SS_R.$$

This shows that one can interpret the sequential, partial coefficient of determination

$$\frac{SS_R(\beta_j | \beta_0, \beta_1, \ldots, \beta_{j-1})}{S_{yy}}$$

as explanatory power of the variable $x_j$, under the condition that the variables
$x_1, x_2, \ldots, x_{j-1}$ are already included in the model. This partial coefficient of determination
depends on the order of the added variables. This dependency can be eliminated
by averaging over all possible sequences of variables.
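
The sequential partitioning can be sketched as a loop over nested models: each model is fitted, its regression sum of squares is computed, and the increase over the previous model is recorded. The helper names `solve`, `ss_r` and `sequential_gains` and the toy data are hypothetical; the normal equations are solved by Gaussian elimination as a simple stand-in for a library least-squares routine.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for the square system A z = b."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def ss_r(columns, ys):
    """Regression sum of squares of the model with intercept and the given
    regressor columns (each column lists one variable over all observations)."""
    n = len(ys)
    y_bar = sum(ys) / n
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    m = len(X[0])
    XtX = [[sum(x[j] * x[l] for x in X) for l in range(m)] for j in range(m)]
    Xty = [sum(x[j] * y for x, y in zip(X, ys)) for j in range(m)]
    beta = solve(XtX, Xty)
    y_hat = [sum(b * v for b, v in zip(beta, row)) for row in X]
    return sum((p - y_bar) ** 2 for p in y_hat)


def sequential_gains(columns, ys):
    """Increase of SS_R when the variables are added in the given order."""
    gains, previous = [], 0.0  # the intercept-only model has SS_R = 0
    for j in range(1, len(columns) + 1):
        current = ss_r(columns[:j], ys)
        gains.append(current - previous)
        previous = current
    return gains


# Hypothetical toy data: two regressor variables, five observations
x1 = [0.0, 1.0, 0.0, 1.0, 2.0]
x2 = [0.0, 0.0, 1.0, 1.0, 1.0]
ys = [1.0, 3.5, -2.0, 0.5, 1.5]

gains = sequential_gains([x1, x2], ys)
total = ss_r([x1, x2], ys)
print(gains, total)
```

Swapping the order of the two columns generally changes the individual gains but not their sum, which is the order dependence described in the text.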

Average explanatory power of individual coefficients. One first computes all possible
sequential, partial coefficients of determination which can be obtained by adding
the variable $x_j$ to all possible combinations of the already included variables. Summing
up these coefficients and dividing the result by the total number of possibilities,
one obtains a measure for the contribution of the variable $x_j$ to the explanatory power
of the model.

Average over orderings was proposed by [16]; further details and advanced considerations
can be found, for instance, in [8, 10]. The concept does not use probabilistically
motivated indicators. Instead it is based on the data and on combinatorics,
thus belongs to descriptive data analysis. Such descriptive methods, in contrast to the
commonly used statistical hypothesis testing, do not require additional assumptions
which may be difficult to justify.

Example 18.11 We compute the explanatory power of the coefficients in the delivery
time problem of Example 18.8. First we fit the two univariate models

$$y = \beta_0 + \beta_1 x_1, \qquad y = \beta_0 + \beta_2 x_2$$

and from that obtain

$$SS_R(\beta_0, \beta_1) = 5382.4, \qquad SS_R(\beta_0, \beta_2) = 4599.1,$$

with the regression coefficients $\hat\beta_0 = 3.3208$, $\hat\beta_1 = 2.1762$ in the first and $\hat\beta_0 =
4.9612$, $\hat\beta_2 = 0.0426$ in the second case. With the already computed values of the
bivariate model

$$SS_R(\beta_0, \beta_1, \beta_2) = SS_R = 5550.8, \qquad S_{yy} = 5784.5$$

from Example 18.10 we obtain the two sequences

$$SS_R(\beta_1 | \beta_0) = 5382.4 \approx 93.05\% \text{ of } S_{yy},$$
$$SS_R(\beta_2 | \beta_0, \beta_1) = 168.4 \approx 2.91\% \text{ of } S_{yy}$$

and

$$SS_R(\beta_2 | \beta_0) = 4599.1 \approx 79.51\% \text{ of } S_{yy},$$
$$SS_R(\beta_1 | \beta_0, \beta_2) = 951.7 \approx 16.45\% \text{ of } S_{yy}.$$

The average explanatory power of the variable $x_1$ (or of the coefficient $\beta_1$) is

$$\frac{1}{2} \bigl( 93.05 + 16.45 \bigr)\% = 54.75\%,$$

the one of the variable $x_2$ is

$$\frac{1}{2} \bigl( 2.91 + 79.51 \bigr)\% = 41.21\%;$$

the remaining 4.04% stay unexplained. The result is represented in Fig. 18.7.

Fig. 18.7 Average explanatory powers of the individual variables (pie chart with segments: proportion $\beta_1$, proportion $\beta_2$, unexplained)

Numerical calculation of the average explanatory powers. In the case of more
than two independent variables one has to take care that all possible sequences (represented
by permutations of the variables) are considered. This will be exemplarily
shown with three variables $x_1, x_2, x_3$. In the left column of the table there are the
$3! = 6$ permutations of $\{1, 2, 3\}$, the other columns list the sequentially obtained
values of $SS_R$.

    1 2 3    SS_R(β_1|β_0)    SS_R(β_2|β_0,β_1)    SS_R(β_3|β_0,β_1,β_2)
    1 3 2    SS_R(β_1|β_0)    SS_R(β_3|β_0,β_1)    SS_R(β_2|β_0,β_1,β_3)
    2 1 3    SS_R(β_2|β_0)    SS_R(β_1|β_0,β_2)    SS_R(β_3|β_0,β_2,β_1)
    2 3 1    SS_R(β_2|β_0)    SS_R(β_3|β_0,β_2)    SS_R(β_1|β_0,β_2,β_3)
    3 1 2    SS_R(β_3|β_0)    SS_R(β_1|β_0,β_3)    SS_R(β_2|β_0,β_3,β_1)
    3 2 1    SS_R(β_3|β_0)    SS_R(β_2|β_0,β_3)    SS_R(β_1|β_0,β_3,β_2)

Obviously the sum of each row is always equal to $SS_R$, so that the sum of all entries
is equal to $6 \cdot SS_R$. Note that amongst the 18 $SS_R$-values there are actually only 12
different ones.

The average explanatory power of the variable $x_1$ is defined by $M_1 / S_{yy}$, where

$$M_1 = \frac{1}{6} \Bigl( SS_R(\beta_1 | \beta_0) + SS_R(\beta_1 | \beta_0) + SS_R(\beta_1 | \beta_0, \beta_2) + SS_R(\beta_1 | \beta_0, \beta_3)
+ SS_R(\beta_1 | \beta_0, \beta_2, \beta_3) + SS_R(\beta_1 | \beta_0, \beta_3, \beta_2) \Bigr)$$

and analogously for the other variables. As remarked above, we have

$$M_1 + M_2 + M_3 = SS_R,$$

and thus the total partitioning adds up to one

$$\frac{M_1}{S_{yy}} + \frac{M_2}{S_{yy}} + \frac{M_3}{S_{yy}} + \frac{SS_E}{S_{yy}} = 1.$$

For a more detailed analysis of the underlying combinatorics, for the necessary
modifications in the case of collinearity of the data (linear dependence of the columns
of the matrix $\mathbf{X}$) and for a discussion of the significance of the average explanatory
power, we refer to the literature quoted above. The algorithm is implemented in the
applet Linear regression.
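
A brute-force sketch of the averaging over all orderings is given below for three variables. It is an illustration under stated assumptions, not the applet's implementation: the helper names `solve`, `ss_r` and `average_powers` and the data are hypothetical, every nested model is refitted from scratch, and collinear data are not handled.

```python
from itertools import permutations


def solve(A, b):
    """Gaussian elimination with partial pivoting for the square system A z = b."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def ss_r(columns, ys):
    """Regression sum of squares of the model with intercept and given columns."""
    n = len(ys)
    y_bar = sum(ys) / n
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    m = len(X[0])
    XtX = [[sum(x[j] * x[l] for x in X) for l in range(m)] for j in range(m)]
    Xty = [sum(x[j] * y for x, y in zip(X, ys)) for j in range(m)]
    beta = solve(XtX, Xty)
    y_hat = [sum(b * v for b, v in zip(beta, row)) for row in X]
    return sum((p - y_bar) ** 2 for p in y_hat)


def average_powers(columns, ys):
    """Average explanatory power M_j / S_yy of each variable over all orderings."""
    k = len(columns)
    y_bar = sum(ys) / len(ys)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    totals = [0.0] * k
    orders = list(permutations(range(k)))
    for order in orders:
        previous = 0.0  # intercept-only model
        for pos in range(k):
            current = ss_r([columns[j] for j in order[:pos + 1]], ys)
            totals[order[pos]] += current - previous
            previous = current
    return [t / len(orders) / s_yy for t in totals]


# Hypothetical toy data: three regressor variables, six observations
x1 = [0.0, 1.0, 0.0, 1.0, 2.0, 3.0]
x2 = [0.0, 0.0, 1.0, 1.0, 1.0, 2.0]
x3 = [1.0, 0.0, 2.0, 1.0, 0.0, 1.0]
ys = [1.0, 3.5, -2.0, 0.5, 1.5, 2.0]

powers = average_powers([x1, x2, x3], ys)
y_bar = sum(ys) / len(ys)
r2 = ss_r([x1, x2, x3], ys) / sum((y - y_bar) ** 2 for y in ys)
print(powers, sum(powers), r2)
```

The averaged powers add up to the multiple coefficient of determination, so together with the unexplained fraction they sum to one. The number of orderings grows factorially with the number of variables, which limits this direct approach to small models.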

Experiment 18.12 Open the applet Linear regression and load data set number 9.
It contains experimental data quantifying the influence of different aggregates on a
mixture of concrete. The meaning of the output variables $x_1$ through $x_4$ and the input
variables $x_5$ through $x_{13}$ is explained in the online description of the applet. Experiment
with different selections of the variables of the model. An interesting initial
model is obtained, for example, by choosing $x_6, x_9, x_{10}, x_{11}, x_{12}, x_{13}$ as independent
and $x_1$ as dependent variable; then remove variables with low explanatory
power and draw a pie chart.


18.5  Exercises 

1. The total consumption of electric energy in Austria 1970-2015 is given in Table
18.1 (from [26, Table 22.13]). The task is to carry out a linear regression of the
form $y = \beta_0 + \beta_1 x$ through the data.

(a) Write down the matrix $\mathbf{X}$ explicitly and compute the coefficients $\hat{\boldsymbol\beta} =
[\hat\beta_0, \hat\beta_1]^T$ using the MATLAB command beta = X\y.

(b) Check the goodness of fit by computing $R^2$. Plot a scatter diagram with the
fitted straight line. Compute the forecast $\hat y$ for 2020.

Table 18.1 Electric energy consumption in Austria, year = $x_i$, consumption = $y_i$ [GWh]

    x_i    1970     1980     1990     2000     2005     2010     2013     2014     2015
    y_i    23.908   37.473   48.529   58.512   66.083   68.931   69.934   68.918   69.747

2. A sample of $n = 44$ civil engineering students at the University of Innsbruck in
the year 1998 gave the values for $x$ = height [cm] and $y$ = weight [kg], listed
in the M-file mat18_ex2.m. Compute the regression line $y = \beta_0 + \beta_1 x$, plot
the scatter diagram and calculate the coefficient of determination $R^2$.

3.  Solve  Exercise  1  using  Excel. 

4.  Solve  Exercise  1  using  the  statistics  package  SPSS. 

Hint. Enter the data in the worksheet Data View; the names of the variables and
their properties can be defined in the worksheet Variable View. Go to Analyze
→ Regression → Linear.

5. The stock of buildings in Austria 1869-2011 is given in the M-file
mat18_ex5.m (data from [26, Table 12.01]). Compute the regression line
$y = \beta_0 + \beta_1 x$ and the regression parabola $y = \alpha_0 + \alpha_1 (x - 1860)^2$ through the
data and test which model fits better, using the coefficient of determination $R^2$.

6. The monthly share index for four breweries from November 1999 to November
2000 is given in the M-file mat18_ex6.m (November 1999 = 100%, from the
Austrian magazine profil 46/2000). Fit a univariate linear model $y = \beta_0 + \beta_1 x$
to each of the four data sets ($x$ ... date, $y$ ... share index), plot the results in
four equally scaled windows, evaluate the results by computing $R^2$ and check
whether the caption provided by profil is justified by the data. For the calculation
you may use the MATLAB program mat18_1.m.

Hint. A solution is suggested in the M-file mat18_exsol6.m.

7. Continuation of Exercise 5, stock of buildings in Austria. Fit the model

$$y = \beta_0 + \beta_1 x + \beta_2 (x - 1860)^2$$

and compute $SS_R = SS_R(\beta_0, \beta_1, \beta_2)$ and $S_{yy}$. Further, analyse the increase of
explanatory power through adding the respective missing variable in the models
of Exercise 5, i.e., compute $SS_R(\beta_2 | \beta_0, \beta_1)$ and $SS_R(\beta_1 | \beta_0, \beta_2)$ as well as the
average explanatory power of the individual coefficients. Compare with the result
for data set number 5 in the applet Linear regression.

8. The M-file mat18_ex8.m contains the mileage per gallon $y$ of 30 cars depending
on the engine displacement $x_1$, the horsepower $x_2$, the overall length $x_3$ and
the weight $x_4$ of the vehicle (from: Motor Trend 1975, according to [19]). Fit
the linear model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$$

and estimate the explanatory power of the individual coefficients through a simple
sequential analysis

$$SS_R(\beta_1 | \beta_0), \quad SS_R(\beta_2 | \beta_0, \beta_1), \quad SS_R(\beta_3 | \beta_0, \beta_1, \beta_2), \quad SS_R(\beta_4 | \beta_0, \beta_1, \beta_2, \beta_3).$$

Compare your result with the average explanatory power of the coefficients for
data set number 2 in the applet Linear regression.

Hint. A suggested solution is given in the M-file mat18_exsol8.m.

9.  Check  the  results  of  Exercises  2  and  6  using  the  applet  Linear  regression  (data 
sets  1  and  4);  likewise  for  the  Examples  18.1  and  18.8  with  the  data  sets  8  and 
3.  In  particular,  investigate  in  data  set  8  whether  height,  weight  and  the  risk  of 
breaking  a  leg  are  in  any  linear  relation. 

10. Continuation of Exercise 14 from Sect. 8.4. A more accurate linear approximation
to the relation between shear strength $\tau$ and normal stress $\sigma$ is delivered
by Coulomb's model $\tau = c + k\sigma$ where $k = \tan \varphi$ and $c$ [kPa] is interpreted
as cohesion. Recompute the regression model of Exercise 14 in Sect. 8.4 with
nonzero intercept. Check that the resulting cohesion is indeed small as compared
to the applied stresses, and compare the resulting friction angles.

11. (Change point analysis) The consumer price data from Example 8.21 suggest
that there might be a change in the slope of the regression line around the year
2013, see also Fig. 8.9. Given data $(x_1, y_1), \ldots, (x_n, y_n)$ with ordered data points
$x_1 < x_2 < \ldots < x_n$, phenomena of this type can be modelled by a piecewise
linear regression

$$y = \begin{cases} \alpha_0 + \alpha_1 x, & x < x^*, \\ \beta_0 + \beta_1 x, & x > x^*. \end{cases}$$

If the slopes $\alpha_1$ and $\beta_1$ are different, $x^*$ is called a change point. A change point
can be detected by fitting models

$$\alpha_0 + \alpha_1 x_i, \qquad i = 1, \ldots, m,$$
$$\beta_0 + \beta_1 x_i, \qquad i = m + 1, \ldots, n$$

and varying the index $m$ between 2 and $n - 1$ until a two-line model with the
smallest total residual sum of squares $SS_E(\alpha_0, \alpha_1) + SS_E(\beta_0, \beta_1)$ is found. The
change point $x^*$ is the point of intersection of the two predicted lines. (If the
overall one-line model has the smallest $SS_E$, there is no change point.)

Find out whether there is a change point in the data of Example 8.21. If so, locate
it and use the two-line model to predict the consumer price index for 2017.

12. Atmospheric CO2 concentration has been recorded at Mauna Loa, Hawaii, since
1958. The yearly averages (1959-2008) in ppm can be found in the MATLAB
program mat18_ex12.m; the data are from [14].

(a) Fit an exponential model $y = \alpha_0 e^{\alpha_1 x}$ to the data and compare the prediction
with the actual data (2017: 406.53 ppm).

Hint. Taking logarithms leads to the linear model $z = \beta_0 + \beta_1 x$ with
$z = \log y$, $\beta_0 = \log \alpha_0$, $\beta_1 = \alpha_1$. Estimate the coefficients $\beta_0$, $\beta_1$ and compute
$\alpha_0$, $\alpha_1$ as well as the prediction for $y$.

(b) Fit a square exponential model $y = \alpha_0 e^{\alpha_1 x + \alpha_2 x^2}$ to the data and check
whether this yields a better fit and prediction.


19 Differential Equations


In  this  chapter  we  discuss  the  theory  of  initial  value  problems  for  ordinary  differential 
equations.  We  limit  ourselves  to  scalar  equations  here;  systems  will  be  discussed  in 
the  next  chapter. 

After  presenting  the  general  definition  of  a  differential  equation  and  the  geometric 
significance  of  its  direction  field,  we  start  with  a  detailed  discussion  of  first-order 
linear  equations.  As  important  applications  we  discuss  the  modelling  of  growth  and 
decay  processes.  Subsequently,  we  investigate  questions  of  existence  and  (local) 
uniqueness  of  the  solution  of  general  differential  equations  and  discuss  the  method 
of  power  series.  We  also  study  the  qualitative  behaviour  of  solutions  close  to  an 
equilibrium  point.  Finally,  we  discuss  the  solution  of  second-order  linear  problems 
with  constant  coefficients. 


19.1 Initial Value Problems

Differential equations are equations involving a (sought after) function and its
derivative(s). They play a decisive role in modelling time-dependent processes.

Definition 19.1 Let $D \subset \mathbb{R}^2$ be open and $f : D \subset \mathbb{R}^2 \to \mathbb{R}$ continuous. The
equation

\[ y'(x) = f(x, y(x)) \]

is called (an ordinary) first-order differential equation. A solution is a differentiable
function $y : I \to \mathbb{R}$ which satisfies the equation for all $x \in I$.

One often suppresses the independent variable $x$ in the notation and writes the
above problem for short as

\[ y' = f(x, y). \]

The sought after function $y$ in this equation is also called the dependent variable
(depending on $x$).

In modelling time-dependent processes, one usually denotes the independent variable
by $t$ (for time) and the dependent variable by $x = x(t)$. In this case one writes
the first-order differential equation as

\[ \dot{x}(t) = f(t, x(t)) \]

or for short as $\dot{x} = f(t, x)$.


Example 19.2 (Separation of the variables) We want to find all functions $y = y(x)$
satisfying the equation $y'(x) = x \cdot y(x)^2$. In this example one obtains the solutions
by separating the variables. For $y \ne 0$ one divides the differential equation by $y^2$
and gets

\[ \frac{1}{y^2} \cdot y' = x. \]

The left-hand side of this equation is of the form $g(y) \cdot y'$. Let $G(y)$ be an
antiderivative of $g(y)$. According to the chain rule, and recalling that $y$ is a function of $x$, we
obtain

\[ \frac{d}{dx} G(y) = \frac{d}{dy} G(y) \cdot \frac{dy}{dx} = g(y) \cdot y'. \]

In our example we have $g(y) = y^{-2}$ and $G(y) = -y^{-1}$, consequently

\[ \frac{d}{dx} \left( -\frac{1}{y} \right) = \frac{1}{y^2} \cdot y' = x. \]

Integration of this equation with respect to $x$ results in

\[ -\frac{1}{y} = \frac{x^2}{2} + C, \]

where $C$ denotes an arbitrary integration constant. By elementary manipulations we
find

\[ y = \frac{1}{-x^2/2 - C} = \frac{2}{K - x^2} \]

with the constant $K = -2C$.

The function $y = 0$ is also a solution of the differential equation. Formally, one
obtains it from the above solution by setting $K = \infty$. The example shows that
differential equations have infinitely many solutions in general. By requiring an additional
condition, a unique solution can be selected. For example, setting $y(0) = 1$ gives
$y(x) = 2/(2 - x^2)$.
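The result of Example 19.2 can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names, the step size h and the test points are our own choices, and the derivative is approximated by a central difference rather than computed exactly.

```python
def y(x, K=2.0):
    """Solution family y = 2 / (K - x^2) from Example 19.2 (valid where x^2 != K)."""
    return 2.0 / (K - x * x)


def residual(x, K=2.0, h=1e-6):
    """y'(x) - x * y(x)^2, with y' approximated by a central difference."""
    dy = (y(x + h, K) - y(x - h, K)) / (2.0 * h)
    return dy - x * y(x, K) ** 2


# K = 2 corresponds to the initial condition y(0) = 1.
test_points = (-1.0, -0.5, 0.0, 0.5, 1.0)
max_residual = max(abs(residual(x)) for x in test_points)
print(y(0.0), max_residual)
```

The residual should be of the size of the discretisation and rounding errors only; other values of K (hypothetical choices such as K = 5) can be tested in the same way.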


Fig. 19.1 The direction field of $y' = -2xy/(x^2 + 2y)$

Definition 19.3 The differential equation $y'(x) = f(x, y(x))$ together with the
additional condition $y(x_0) = y_0$, i.e.,

\[ y'(x) = f(x, y(x)), \quad y(x_0) = y_0, \]

is called an initial value problem. A solution of an initial value problem is a
(continuously) differentiable function $y(x)$ which satisfies the differential equation and the
initial condition $y(x_0) = y_0$.

Geometric interpretation of a differential equation. For a given first-order
differential equation

\[ y' = f(x, y), \quad (x, y) \in D \subset \mathbb{R}^2 \]

one searches for a differentiable function $y = y(x)$ whose graph lies in $D$ and whose
tangents have the slopes $\tan\varphi = y'(x) = f(x, y(x))$ for each $x$. By plotting short
arrows with slopes $\tan\varphi = f(x, y)$ at the points $(x, y) \in D$ one obtains the direction
field of the differential equation. The direction field is tangential to the solution curves
and offers a good visual impression of their shapes. Figure 19.1 shows the direction
field of the differential equation

\[ y' = -\frac{2xy}{x^2 + 2y}. \]

The right-hand side has singularities along the curve $y = -x^2/2$, which is reflected
by the behaviour of the arrows in the lower part of the figure.
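The construction of a direction field can be sketched in a few lines of Python. This is a minimal illustration, not the program used for Fig. 19.1: the grid, the tolerance eps and the function names are assumptions made here, and the arrows are only computed, not plotted.

```python
import math


def f(x, y):
    """Right-hand side of y' = -2xy / (x^2 + 2y)."""
    return -2.0 * x * y / (x * x + 2.0 * y)


def direction_field(xs, ys, eps=1e-12):
    """Return a list of (x, y, dx, dy) with unit arrows of slope f(x, y).

    Grid points on the singular curve x^2 + 2y = 0 are skipped.
    """
    arrows = []
    for x in xs:
        for y in ys:
            if abs(x * x + 2.0 * y) < eps:
                continue
            phi = math.atan(f(x, y))          # tan(phi) = f(x, y)
            arrows.append((x, y, math.cos(phi), math.sin(phi)))
    return arrows


grid = [i / 2.0 for i in range(-4, 5)]        # -2, -1.5, ..., 2
field = direction_field(grid, grid)
print(len(field), "arrows computed")
```

Each tuple can be passed to any plotting routine that draws a short line segment from (x, y) in the direction (dx, dy).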

Experiment 19.4 Visualise the direction field of the above differential equation with
the applet Dynamical systems in the plane.

19.2 First-Order Linear Differential Equations

Let $a(x)$ and $g(x)$ be functions defined on some interval. The equation

\[ y' + a(x)\, y = g(x) \]

is called a first-order linear differential equation. The function $a$ is the coefficient, the
right-hand side $g$ is called inhomogeneity. The differential equation is called
homogeneous if $g = 0$, otherwise inhomogeneous. First we state the following important
result.

Proposition 19.5 (Superposition principle) If $y$ and $z$ are solutions of a linear
differential equation with possibly different inhomogeneities

\[ y'(x) + a(x)\, y(x) = g(x), \]
\[ z'(x) + a(x)\, z(x) = h(x), \]

then their linear combination

\[ w(x) = \alpha y(x) + \beta z(x), \quad \alpha, \beta \in \mathbb{R} \]

solves the linear differential equation

\[ w'(x) + a(x)\, w(x) = \alpha g(x) + \beta h(x). \]

Proof This so-called superposition principle follows from the linearity of the
derivative and the linearity of the equation. □

In  a  first  step  we  compute  all  solutions  of  the  homogeneous  equation.  We  will  use 
the  superposition  principle  later  to  find  all  solutions  of  the  inhomogeneous  equation. 

Proposition 19.6 The general solution of the homogeneous differential equation

\[ y' + a(x)\, y = 0 \]

is

\[ y_h(x) = K e^{-A(x)} \]

with $K \in \mathbb{R}$ and an arbitrary antiderivative $A(x)$ of $a(x)$.


Proof For $y \ne 0$ we separate the variables

\[ \frac{1}{y} \cdot y' = -a(x) \]

and use

\[ \frac{d}{dy} \log |y| = \frac{1}{y} \]

to obtain

\[ \log |y| = -A(x) + C \]

by integrating the equation. From that we infer

\[ |y(x)| = e^{-A(x)} e^{C}. \]

This formula shows that $y(x)$ cannot change sign since the right-hand side is never
zero. Thus $K = e^{C} \cdot \operatorname{sign} y(x)$ is a constant as well, and the formula

\[ y(x) = \operatorname{sign} y(x) \cdot |y(x)| = K e^{-A(x)}, \quad K \in \mathbb{R} \]

yields all solutions of the homogeneous equation. □

Fig. 19.2 The direction field of $y' = y$ (left) and $y' = -y$ (right)


Example 19.7 The linear differential equation

\[ \dot{x} = ax \]

with constant coefficient $a$ has the general solution

\[ x(t) = K e^{at}, \quad K \in \mathbb{R}. \]

The constant $K$ is determined by $x(0)$, for example.

The direction field of the differential equation $y' = ay$ (depending on the sign of
the coefficient) is shown in Fig. 19.2.

Interpretation. Let $x(t)$ be a time-dependent function which describes a growth or
decay process (population increase/decrease, change of mass, etc.). We consider a
time interval $[t, t + h]$ with $h > 0$. For $x(t) \ne 0$ the relative change of $x$ in this time
interval is given by

\[ \frac{x(t + h) - x(t)}{x(t)} = \frac{x(t + h)}{x(t)} - 1. \]

The relative rate of change (change per unit of time) is thus

\[ \frac{x(t + h) - x(t)}{t + h - t} \cdot \frac{1}{x(t)} = \frac{x(t + h) - x(t)}{h \cdot x(t)}. \]

For an ideal growth process this rate only depends on time $t$. In the limit $h \to 0$ this
leads to the instantaneous relative rate of change

\[ a(t) = \lim_{h \to 0} \frac{x(t + h) - x(t)}{h \cdot x(t)} = \frac{\dot{x}(t)}{x(t)}. \]

Ideal growth processes thus may be modelled by the linear differential equation

\[ \dot{x}(t) = a(t)\, x(t). \]


Example 19.8 (Radioactive decay) Let $x(t)$ be the concentration of a radioactive
substance at time $t$. In radioactive decay the rate of change does not depend on time
and is negative,

\[ a(t) = a < 0. \]

The solution of the equation $\dot{x} = ax$ with initial value $x(0) = x_0$ is

\[ x(t) = e^{at} x_0. \]

It is exponentially decreasing and $\lim_{t \to \infty} x(t) = 0$, see Fig. 19.3. The half life $T$,
the time in which half of the substance has decayed, is obtained from

\[ \frac{x_0}{2} = e^{aT} x_0 \quad \text{as} \quad T = -\frac{\log 2}{a}. \]

The half life for $a = -2$ is indicated in Fig. 19.3 by the dotted lines.
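As a small numerical illustration of the half life formula, the following Python sketch evaluates $T = -\log 2 / a$ for the decay constant $a = -2$ and confirms that the solution has dropped to half of its initial value at time $T$. The function names and the normalisation $x_0 = 1$ are assumptions made for this example.

```python
import math


def half_life(a):
    """Half life T = -log(2) / a of x' = a * x for a decay constant a < 0."""
    if a >= 0:
        raise ValueError("decay requires a < 0")
    return -math.log(2.0) / a


def x(t, a, x0=1.0):
    """Solution x(t) = exp(a * t) * x0 of the decay equation."""
    return math.exp(a * t) * x0


T = half_life(-2.0)            # the case marked in Fig. 19.3
print(T, x(T, -2.0))
```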


Fig. 19.3 Radioactive decay with constants $a = -0.5, -1, -2$ (top to bottom)


Example 19.9 (Population models) Let $x(t)$ be the size of a population at time $t$,
modelled by $\dot{x} = ax$. If a constant, positive rate of growth $a > 0$ is presumed, then
the population grows exponentially

\[ x(t) = e^{at} x_0, \quad \lim_{t \to \infty} |x(t)| = \infty. \]

One calls this behaviour Malthusian law^1. In 1839 Verhulst suggested an improved
model which also takes limited resources into account

\[ \dot{x}(t) = (\alpha - \beta x(t)) \cdot x(t) \quad \text{with } \alpha, \beta > 0. \]

The corresponding discrete model was already discussed in Example 5.3, where $L$
denoted the quotient $\alpha/\beta$.

The rate of growth in Verhulst's model is population dependent, namely equal to
$\alpha - \beta x(t)$, and decreases linearly with increasing population. Verhulst's model can
be solved by separating the variables (or with maple). One obtains

\[ x(t) = \frac{\alpha}{\beta + C \alpha e^{-\alpha t}} \]

and thus, independently of the initial value,

\[ \lim_{t \to \infty} x(t) = \frac{\alpha}{\beta}, \]

see also Fig. 19.4. The stationary solution $x(t) = \alpha/\beta$ is an asymptotically stable
equilibrium point of Verhulst's model, see Sect. 19.5.
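The closed-form solution of Verhulst's model can be compared with a numerical integration. The sketch below is an illustration under our own assumptions: the parameter values $\alpha = 1$, $\beta = 0.5$, the initial values and the use of the classical Runge-Kutta method with 1000 steps are arbitrary choices; the constant $C = 1/x_0 - \beta/\alpha$ follows from inserting $t = 0$ into the formula above.

```python
import math


def verhulst_exact(t, alpha, beta, x0):
    """Closed-form solution x(t) = alpha / (beta + C * alpha * exp(-alpha * t))."""
    C = 1.0 / x0 - beta / alpha              # from x(0) = x0
    return alpha / (beta + C * alpha * math.exp(-alpha * t))


def verhulst_rk4(t_end, alpha, beta, x0, steps=1000):
    """Integrate x' = (alpha - beta * x) * x with the classical Runge-Kutta method."""
    def f(x):
        return (alpha - beta * x) * x

    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


alpha, beta = 1.0, 0.5                       # illustrative values, limit alpha / beta = 2
for x0 in (0.1, 1.0, 5.0):
    print(x0, verhulst_exact(10.0, alpha, beta, x0), verhulst_rk4(10.0, alpha, beta, x0))
```

For all three positive initial values both columns should agree closely and approach the limit $\alpha/\beta$.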

Fig. 19.4 Population increase according to Malthus and Verhulst

^1 T.R. Malthus, 1766-1834.

Variation of constants. We now turn to the solution of the inhomogeneous equation

\[ y' + a(x)\, y = g(x). \]

We already know the general solution

\[ y_h(x) = c \cdot e^{-A(x)}, \quad c \in \mathbb{R} \]

of the homogeneous equation with the antiderivative

\[ A(x) = \int_{x_0}^{x} a(\xi)\, d\xi. \]

We look for a particular solution of the inhomogeneous equation of the form

\[ y_p(x) = c(x) \cdot y_h(x) = c(x) \cdot e^{-A(x)}, \]

where we allow the constant $c = c(x)$ to be a function of $x$ (variation of constants).
Substituting this formula into the inhomogeneous equation and differentiating using
the product rule yields

\[
\begin{aligned}
y_p'(x) + a(x)\, y_p(x) &= c'(x)\, y_h(x) + c(x)\, y_h'(x) + a(x)\, y_p(x) \\
&= c'(x)\, y_h(x) - a(x)\, c(x)\, y_h(x) + a(x)\, y_p(x) \\
&= c'(x)\, y_h(x).
\end{aligned}
\]

If one equates this expression with the inhomogeneity $g(x)$, one recognises that $c(x)$
fulfils the differential equation

\[ c'(x) = e^{A(x)} g(x) \]

which can be solved by integration

\[ c(x) = \int_{x_0}^{x} e^{A(\xi)} g(\xi)\, d\xi. \]

We  thus  obtain  the  following  proposition. 

Proposition 19.10 The differential equation

\[ y' + a(x)\, y = g(x) \]

has the general solution

\[ y(x) = e^{-A(x)} \left( \int_{x_0}^{x} e^{A(\xi)} g(\xi)\, d\xi + K \right) \]

with $A(x) = \int_{x_0}^{x} a(\xi)\, d\xi$ and an arbitrary constant $K \in \mathbb{R}$.

Proof By the above considerations, the function $y(x)$ is a solution of the differential
equation $y' + a(x)\, y = g(x)$. Conversely, let $z(x)$ be any other solution. Then,
according to the superposition principle, the difference $z(x) - y(x)$ is a solution of
the homogeneous equation, so

\[ z(x) = y(x) + c\, e^{-A(x)}. \]

Therefore, $z(x)$ also has the form stated in the proposition. □

Corollary 19.11 Let $y_p$ be an arbitrary solution of the inhomogeneous linear
differential equation

\[ y' + a(x)\, y = g(x). \]

Then, its general solution can be written as

\[ y(x) = y_p(x) + y_h(x) = y_p(x) + K e^{-A(x)}, \quad K \in \mathbb{R}. \]

Proof This statement follows from the proof of Proposition 19.10 or directly from
the superposition principle. □

Example 19.12 We solve the problem $y' + 2y = e^{4x} + 1$. The solution of the
homogeneous equation is $y_h(x) = c\, e^{-2x}$. A particular solution can be found by variation
of constants. From

\[ c(x) = \int_0^x e^{2\xi} \left( e^{4\xi} + 1 \right) d\xi = \frac{1}{6} e^{6x} + \frac{1}{2} e^{2x} - \frac{2}{3} \]

it follows that

\[ y_p(x) = \frac{1}{6} e^{4x} - \frac{2}{3} e^{-2x} + \frac{1}{2}. \]

The general solution is thus

\[ y(x) = y_p(x) + y_h(x) = K e^{-2x} + \frac{1}{6} e^{4x} + \frac{1}{2}. \]

Here, we have combined the two terms containing $e^{-2x}$. The new constant $K$ can be
determined from an additional initial condition $y(0) = \alpha$, namely

\[ K = \alpha - \frac{2}{3}. \]
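The formula of Proposition 19.10 can also be evaluated numerically when the integrals are not available in closed form. The following Python sketch is one possible illustration: the composite Simpson rule, the number of subintervals and the function names are our own assumptions, and with $x_0$ as lower integration limit the constant $K$ equals the initial value $y(x_0)$. The result is compared with the closed-form solution of Example 19.12.

```python
import math


def simpson(func, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0


def solve_linear(a, g, x0, y0, x):
    """Evaluate y(x) = exp(-A(x)) * (int_{x0}^{x} exp(A(s)) g(s) ds + y0)
    with A(x) = int_{x0}^{x} a(s) ds, so that y(x0) = y0."""
    def A(s):
        return simpson(a, x0, s)

    integral = simpson(lambda s: math.exp(A(s)) * g(s), x0, x)
    return math.exp(-A(x)) * (integral + y0)


# Example 19.12: y' + 2y = exp(4x) + 1 with y(0) = alpha (alpha = 1 is an arbitrary choice).
alpha = 1.0
numeric = solve_linear(lambda s: 2.0, lambda s: math.exp(4.0 * s) + 1.0, 0.0, alpha, 1.0)
exact = (alpha - 2.0 / 3.0) * math.exp(-2.0) + math.exp(4.0) / 6.0 + 0.5
print(numeric, exact)
```

The two printed values should agree up to the quadrature error of the Simpson rule.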


19.3 Existence and Uniqueness of the Solution

Finding analytic solutions of differential equations can be a difficult problem and
is often impossible. Apart from some types of differential equations (e.g., linear
problems or equations with separable variables), there is no general procedure to
determine the solution explicitly. Thus numerical methods are used frequently (see
Chap. 21). In the following we discuss the existence and uniqueness of solutions of
general initial value problems.

Proposition 19.13 (Peano's theorem^2) If the function $f$ is continuous in a
neighbourhood of $(x_0, y_0)$, then the initial value problem

\[ y' = f(x, y), \quad y(x_0) = y_0 \]

has a solution $y(x)$ for $x$ close to $x_0$.


Instead  of  a  proof  (see  [11,  Part  I,  Theorem 7.6]),  we  discuss  the  limitations  of 
this  proposition.  First  it  only  guarantees  the  existence  of  a  local  solution  in  the 
neighbourhood  of  the  initial  value.  The  next  example  shows  that  one  cannot  expect 
more,  in  general. 


Example 19.14 We solve the differential equation $\dot{x} = x^2$, $x(0) = 1$. Separation of
the variables yields

\[ \int \frac{dx}{x^2} = \int dt = t + C, \]

and thus

\[ x(t) = \frac{1}{1 - t}. \]

This function has a singularity at $t = 1$, where the solution ceases to exist. This
behaviour is called blow up.
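The blow up can be observed numerically. The following Python sketch compares the exact solution with an explicit Euler approximation as the end time approaches $t = 1$; the Euler method, the number of steps and the chosen end times are assumptions made for this illustration only.

```python
def x_exact(t):
    """Solution x(t) = 1 / (1 - t) of x' = x^2, x(0) = 1, valid for t < 1."""
    return 1.0 / (1.0 - t)


def euler(t_end, steps):
    """Explicit Euler approximation of x' = x^2, x(0) = 1 on [0, t_end]."""
    h = t_end / steps
    x = 1.0
    for _ in range(steps):
        x += h * x * x
    return x


for t_end in (0.5, 0.9, 0.99, 0.999):
    print(t_end, x_exact(t_end), euler(t_end, 100000))
```

The exact values grow without bound as the end time tends to 1, and the numerical approximation becomes increasingly inaccurate close to the singularity.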


Furthermore,  Peano’s  theorem  does  not  give  any  information  on  how  many  solu¬ 
tions  an  initial  value  problem  has.  In  general,  solutions  need  not  be  unique,  as  it  is 
shown  in  the  following  example. 

Example 19.15 The initial value problem $y' = 2\sqrt{|y|}$, $y(0) = 0$ has infinitely many
solutions

\[ y(x) = \begin{cases} (x - b)^2, & b \le x, \\ 0, & -a < x < b, \\ -(x + a)^2, & x \le -a, \end{cases} \qquad a, b > 0 \text{ arbitrary}. \]

For example, for $x < -a$, one verifies at once

\[ y'(x) = -2(x + a) = 2|x + a| = 2\sqrt{(x + a)^2} = 2\sqrt{|y|}. \]

Thus the continuity of $f$ is not sufficient to guarantee the uniqueness of the solution
of initial value problems. One needs somewhat more regularity, namely Lipschitz^3
continuity with respect to the second variable (see also Definition C.14).

^2 G. Peano, 1858-1932.
^3 R. Lipschitz, 1832-1903.

Definition 19.16 Let $D \subset \mathbb{R}^2$ and $f : D \to \mathbb{R}$. The function $f$ is said to satisfy
a Lipschitz condition with Lipschitz constant $L$ on $D$, if the inequality
$|f(x, y) - f(x, z)| \le L\, |y - z|$ holds for all points $(x, y), (x, z) \in D$.

According to the mean value theorem (Proposition 8.4)

\[ f(x, y) - f(x, z) = \frac{\partial f}{\partial y}(x, \xi)\, (y - z) \]

for every differentiable function. If the derivative is bounded, then the function
satisfies a Lipschitz condition. In this case one can choose

\[ L = \sup \left| \frac{\partial f}{\partial y}(x, \xi) \right|. \]

Counterexample 19.17 The function $g(x, y) = \sqrt{|y|}$ does not satisfy a Lipschitz
condition in any $D$ that contains a point with $y = 0$ because

\[ \frac{|g(x, y) - g(x, 0)|}{|y - 0|} = \frac{\sqrt{|y|}}{|y|} = \frac{1}{\sqrt{|y|}} \to \infty \quad \text{for } y \to 0. \]
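Both the non-uniqueness in Example 19.15 and the unbounded difference quotient in Counterexample 19.17 can be checked numerically. In the following Python sketch the left branch of the solution is taken as $-(x + a)^2$, which makes the function continuous at $x = -a$; the pairs $(a, b)$, the test points and the step size are arbitrary choices made for this illustration.

```python
import math


def y(x, a, b):
    """One member of the solution family of y' = 2 * sqrt(|y|), y(0) = 0 (a, b > 0)."""
    if x >= b:
        return (x - b) ** 2
    if x <= -a:
        return -(x + a) ** 2
    return 0.0


def residual(x, a, b, h=1e-6):
    """y'(x) - 2 * sqrt(|y(x)|), with y' approximated by a central difference."""
    dy = (y(x + h, a, b) - y(x - h, a, b)) / (2.0 * h)
    return dy - 2.0 * math.sqrt(abs(y(x, a, b)))


points = [-3.0, -1.2, -0.4, 0.0, 0.7, 2.5]
pairs = [(1.0, 1.0), (0.5, 2.0)]             # two different solutions of the same problem
worst = max(abs(residual(x, a, b)) for (a, b) in pairs for x in points)

# Difference quotient of g(x, y) = sqrt(|y|) at y = 0; it equals 1 / sqrt(|y|).
quotients = [math.sqrt(abs(v)) / abs(v) for v in (1e-2, 1e-4, 1e-6)]
print(worst, quotients)
```

Both parameter pairs give functions that satisfy the differential equation and the initial condition, while the difference quotients grow without bound as $y$ tends to 0.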


Proposition 19.18 If the function $f$ satisfies a Lipschitz condition in the
neighbourhood of $(x_0, y_0)$, then the initial value problem

\[ y' = f(x, y), \quad y(x_0) = y_0 \]

has a unique solution $y(x)$ for $x$ close to $x_0$.


Proof We only show uniqueness; the existence of a solution $y(x)$ on the interval
$[x_0, x_0 + H]$ follows (for small $H$) from Peano's theorem. Uniqueness is proven
indirectly. Assume that $z$ is another solution, different from $y$, on the interval
$[x_0, x_0 + H]$ with $z(x_0) = y_0$. The number

\[ x_1 = \inf \{ x \in \mathbb{R} \,;\; x_0 \le x \le x_0 + H \text{ and } y(x) \ne z(x) \} \]

is thus well-defined. We infer from the continuity of $y$ and $z$ that $y(x_1) = z(x_1)$.
Now we choose $h > 0$ so small that $x_1 + h \le x_0 + H$ and integrate the differential
equation

\[ y'(x) = f(x, y(x)) \]

from $x_1$ to $x_1 + h$. This gives

\[ y(x_1 + h) - y(x_1) = \int_{x_1}^{x_1 + h} y'(x)\, dx = \int_{x_1}^{x_1 + h} f(x, y(x))\, dx \]

and

\[ z(x_1 + h) - y(x_1) = \int_{x_1}^{x_1 + h} f(x, z(x))\, dx. \]

Subtracting the first formula above from the second yields

\[ z(x_1 + h) - y(x_1 + h) = \int_{x_1}^{x_1 + h} \bigl( f(x, z(x)) - f(x, y(x)) \bigr)\, dx. \]

The Lipschitz condition on $f$ gives

\[
\begin{aligned}
|z(x_1 + h) - y(x_1 + h)| &\le \int_{x_1}^{x_1 + h} |f(x, z(x)) - f(x, y(x))|\, dx \\
&\le L \int_{x_1}^{x_1 + h} |z(x) - y(x)|\, dx.
\end{aligned}
\]

Let now

\[ M = \max \{ |z(x) - y(x)| \,;\; x_1 \le x \le x_1 + h \}. \]

Due to the continuity of $y$ and $z$, this maximum exists, see the discussion after
Proposition 6.15. After possibly decreasing $h$ this maximum is attained at $x_1 + h$
and

\[ M = |z(x_1 + h) - y(x_1 + h)| \le L \int_{x_1}^{x_1 + h} M\, dx \le L h M. \]

For a sufficiently small $h$, namely $Lh < 1$, the inequality

\[ M \le L h M \]

implies $M = 0$. Since one can choose $h$ arbitrarily small, $y(x) = z(x)$ holds true for
$x_1 \le x \le x_1 + h$ in contradiction to the definition of $x_1$. Hence the assumed different
solution $z$ does not exist. □


19.4 Method of Power Series

We have encountered several examples of functions that can be represented as series,
e.g. in Chap. 12. Motivated by this we try to solve the initial value problem

\[ y' = f(x, y), \quad y(x_0) = y_0 \]

by means of a series

\[ y(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n. \]

We will use the fact that convergent power series can be differentiated and rearranged
term by term, see for instance [3, Chap. 9, Corollary 7.4].

Example 19.19 We solve once more the linear initial value problem

\[ y' = y, \quad y(0) = 1. \]

For that we differentiate the ansatz

\[ y(x) = \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots \]

term by term with respect to $x$

\[ y'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1} = a_1 + 2 a_2 x + 3 a_3 x^2 + 4 a_4 x^3 + \cdots \]

and substitute the result into the differential equation to get

\[ a_1 + 2 a_2 x + 3 a_3 x^2 + 4 a_4 x^3 + \cdots = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots \]

Since this equation has to hold for all $x$, the unknowns $a_n$ can be determined by
equating the coefficients of same powers of $x$. This gives

\[ a_1 = a_0, \quad 2 a_2 = a_1, \quad 3 a_3 = a_2, \quad 4 a_4 = a_3, \]

and so on. Due to $a_0 = y(0) = 1$ this infinite system of equations can be solved
recursively. One obtains

\[ a_0 = 1, \quad a_1 = 1, \quad a_2 = \frac{1}{2!}, \quad a_3 = \frac{1}{3!}, \quad \ldots, \quad a_n = \frac{1}{n!} \]

and thus the (expected) solution

\[ y(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!} = e^x. \]
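The recursion of Example 19.19 is easily carried out on a computer. The following Python sketch uses exact rational arithmetic; the function name, the truncation index 15 and the evaluation point $x = 0.5$ are our own choices for this illustration.

```python
from fractions import Fraction
import math


def exp_series_coefficients(n_max):
    """Coefficients a_0, ..., a_n_max of y' = y, y(0) = 1 from n * a_n = a_(n-1)."""
    a = [Fraction(1)]                        # a_0 = y(0) = 1
    for n in range(1, n_max + 1):
        a.append(a[n - 1] / n)               # equate the coefficients of x^(n-1)
    return a


coeffs = exp_series_coefficients(15)
partial_sum = sum(float(c) * 0.5 ** n for n, c in enumerate(coeffs))
print(coeffs[:5], partial_sum, math.exp(0.5))
```

The computed coefficients are the reciprocals of the factorials, and the partial sum should agree with the exponential function at the chosen point up to rounding errors.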


Example 19.20 (A particular Riccati differential equation^4) For the solution of the
initial value problem

\[ y' = y^2 + x^2, \quad y(0) = 1, \]

we make the ansatz

\[ y(x) = \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots \]

The initial condition $y(0) = 1$ immediately gives $a_0 = 1$. First, we compute the
product (see also Proposition C.10)

\[
\begin{aligned}
y(x)^2 &= (1 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots)^2 \\
&= 1 + 2 a_1 x + (a_1^2 + 2 a_2) x^2 + (2 a_3 + 2 a_2 a_1) x^3 + \cdots
\end{aligned}
\]

and substitute it into the differential equation

\[
\begin{aligned}
a_1 + 2 a_2 x + 3 a_3 x^2 &+ 4 a_4 x^3 + \cdots \\
&= 1 + 2 a_1 x + (1 + a_1^2 + 2 a_2) x^2 + (2 a_3 + 2 a_2 a_1) x^3 + \cdots
\end{aligned}
\]

Equating coefficients results in

\[
\begin{aligned}
a_1 &= 1, \\
2 a_2 &= 2 a_1, & a_2 &= 1, \\
3 a_3 &= 1 + a_1^2 + 2 a_2, & a_3 &= 4/3, \\
4 a_4 &= 2 a_3 + 2 a_2 a_1, & a_4 &= 7/6.
\end{aligned}
\]

Thus we obtain a good approximation to the solution for small $x$

\[ y(x) = 1 + x + x^2 + \frac{4}{3} x^3 + \frac{7}{6} x^4 + \mathcal{O}(x^5). \]

The maple command

dsolve({diff(y(x),x) = x^2 + y(x)^2, y(0) = 1}, y(x), series);

carries out the above computations.

^4 J.F. Riccati, 1676-1754.
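The coefficient comparison for the Riccati equation can likewise be sketched in Python. The code below is an illustration, not a replacement for the maple command: it forms the Cauchy product of the series with itself and uses the relation $(n+1)\, a_{n+1} = \sum_{k=0}^{n} a_k a_{n-k} + [n = 2]$, where the last term accounts for $x^2$; the function name is hypothetical.

```python
from fractions import Fraction


def riccati_coefficients(n_max):
    """Power series coefficients a_0, ..., a_n_max of y' = y^2 + x^2, y(0) = 1."""
    a = [Fraction(1)]                                        # a_0 = y(0) = 1
    for n in range(n_max):
        cauchy = sum(a[k] * a[n - k] for k in range(n + 1))  # x^n term of y^2
        rhs = cauchy + (1 if n == 2 else 0)                  # x^2 contributes at n = 2
        a.append(rhs / (n + 1))                              # (n+1) a_(n+1) = rhs
    return a


coeffs = riccati_coefficients(4)
print(coeffs)
```

Higher-order coefficients are obtained by increasing the argument of the function.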


19.5 Qualitative Theory

Often one can describe the qualitative behaviour of the solutions of differential
equations without solving the equations themselves. As the simplest case we discuss the
stability of nonlinear differential equations in the neighbourhood of an equilibrium
point. A differential equation is called autonomous, if its right-hand side does not
explicitly depend on the independent variable.

Definition 19.21 The point $y^* \in \mathbb{R}$ is called an equilibrium of the autonomous
differential equation $y' = f(y)$, if $f(y^*) = 0$.

Equilibrium points are particular solutions of the differential equation, so-called
stationary solutions.

In order to investigate solutions in the neighbourhood of an equilibrium point, we
linearise the differential equation at the equilibrium. Let

\[ w(x) = y(x) - y^* \]

denote the distance of the solution $y(x)$ from the equilibrium. Taylor series expansion
of $f$ shows that

\[ w' = y' = f(y) = f(y) - f(y^*) = f'(y^*)\, w + \mathcal{O}(w^2), \]

hence

\[ w'(x) = (a + \mathcal{O}(w))\, w \]

with $a = f'(y^*)$. It is decisive how solutions of this problem behave for small
$w$. Obviously the value of the coefficient $a + \mathcal{O}(w)$ is crucial. If $a < 0$, then
$a + \mathcal{O}(w) < 0$ for sufficiently small $w$ and the function $|w(x)|$ decreases. If on
the other hand $a > 0$, then the function $|w(x)|$ increases for small $w$. With these
considerations one has proven the following proposition.

Proposition 19.22 Let $y^*$ be an equilibrium point of the differential equation
$y' = f(y)$ and assume that $f'(y^*) < 0$. Then all solutions of the differential equation with
initial value $y(0)$ close to $y^*$ satisfy the estimate

\[ |w(x)| \le C \cdot e^{bx} \cdot |w(0)| \]

with constants $C > 0$ and $b < 0$.

Under the conditions of the proposition one calls the equilibrium point
asymptotically stable. An asymptotically stable equilibrium attracts all solutions in a
sufficiently small neighbourhood (exponentially fast), since due to $b < 0$

\[ |w(x)| \to 0 \quad \text{as } x \to \infty. \]

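Example 19.23 Verhulst's model

\[ y' = (\alpha - \beta y)\, y, \quad \alpha, \beta > 0 \]

has two equilibrium points, namely $y_1^* = 0$ and $y_2^* = \alpha/\beta$. Due to

\[ f'(y_1^*) = \alpha - 2\beta y_1^* = \alpha, \quad f'(y_2^*) = \alpha - 2\beta y_2^* = -\alpha, \]

$y_1^* = 0$ is unstable and $y_2^* = \alpha/\beta$ is asymptotically stable.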
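The stability criterion based on the sign of $f'(y^*)$ can be sketched in Python as follows. The derivative is approximated by a central difference, and the parameter values $\alpha = 1$, $\beta = 0.5$ as well as the function names are assumptions made for this illustration; the borderline case $f'(y^*) = 0$ is not decided by the linearisation.

```python
def classify_equilibrium(f, y_star, h=1e-6):
    """Classify an equilibrium of y' = f(y) by the sign of f'(y*).

    f'(y*) is approximated by a central difference; the case f'(y*) = 0
    is not decided by the linearisation.
    """
    a = (f(y_star + h) - f(y_star - h)) / (2.0 * h)
    if a < 0:
        return "asymptotically stable"
    if a > 0:
        return "unstable"
    return "undecided"


alpha, beta = 1.0, 0.5                       # illustrative parameter values


def verhulst(y):
    """Right-hand side of Verhulst's model."""
    return (alpha - beta * y) * y


print(classify_equilibrium(verhulst, 0.0))            # unstable
print(classify_equilibrium(verhulst, alpha / beta))   # asymptotically stable
```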
19.6 Second-Order Problems

The equation

\[ y''(x) + a y'(x) + b y(x) = g(x) \]

is called a second-order linear differential equation with constant coefficients $a$, $b$
and inhomogeneity $g$.

Example 19.24 (Mass-spring-damper model) According to Newton's second law
of motion, a mass-spring system is modelled by the second-order differential
equation

\[ y''(x) + k y(x) = 0, \]

where $y(x)$ denotes the position of the mass and $k$ is the stiffness of the spring. The
solution of this equation describes a free vibration without damping and excitation.
A more realistic model is obtained by adding a viscous damping force $-c y'(x)$ and
an external excitation $g(x)$. This results in the differential equation

\[ m y''(x) + c y'(x) + k y(x) = g(x), \]

which, after division by the mass $m$, is of the above form.

By introducing the new variable $z(x) = y'(x)$, the homogeneous problem

\[ y'' + a y' + b y = 0 \]

can be rewritten as a system of first-order equations

\[ y' = z, \qquad z' = -b y - a z, \]

see Chap. 20, where this approach is worked out in detail.

Here, we will follow a different idea. Let $\alpha$ and $\beta$ denote the roots of the quadratic
equation

\[ \lambda^2 + a \lambda + b = 0, \]

which is called the characteristic equation of the homogeneous problem. Then, the
second-order problem

\[ y''(x) + a y'(x) + b y(x) = g(x) \]

can be factorised in the following way:

\[ \left( \frac{d}{dx} - \beta \right) \left( \frac{d}{dx} - \alpha \right) y(x) = g(x). \]

Setting

\[ w(x) = y'(x) - \alpha y(x), \]

we obtain the following first-order linear differential equation for $w$

\[ w'(x) - \beta w(x) = g(x). \]

This problem has the general solution (see Proposition 19.10)

\[ w(x) = K_2 e^{\beta(x - x_0)} + \int_{x_0}^{x} e^{\beta(x - \xi)} g(\xi)\, d\xi \]

with some constant $K_2$. Inserting this expression into the definition of $w$ shows that
$y$ is the solution of the first-order problem

\[ y'(x) - \alpha y(x) = K_2 e^{\beta(x - x_0)} + \int_{x_0}^{x} e^{\beta(x - \xi)} g(\xi)\, d\xi. \]
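Let us assume for a moment that $\alpha \ne \beta$. Applying once more Proposition 19.10 for
the solution of this problem gives

\[
\begin{aligned}
y(x) &= K_1 e^{\alpha(x - x_0)} + \int_{x_0}^{x} e^{\alpha(x - \eta)} w(\eta)\, d\eta \\
&= K_1 e^{\alpha(x - x_0)} + K_2 \int_{x_0}^{x} e^{\alpha(x - \eta)} e^{\beta(\eta - x_0)}\, d\eta
 + \int_{x_0}^{x} e^{\alpha(x - \eta)} \int_{x_0}^{\eta} e^{\beta(\eta - \xi)} g(\xi)\, d\xi\, d\eta.
\end{aligned}
\]

Since

\[ \int_{x_0}^{x} e^{\alpha(x - \eta)} e^{\beta(\eta - x_0)}\, d\eta = e^{\alpha x - \beta x_0} \int_{x_0}^{x} e^{\eta(\beta - \alpha)}\, d\eta = \frac{1}{\beta - \alpha} \left( e^{\beta(x - x_0)} - e^{\alpha(x - x_0)} \right), \]

we finally obtain

\[ y(x) = c_1 e^{\alpha(x - x_0)} + c_2 e^{\beta(x - x_0)} + \int_{x_0}^{x} e^{\alpha(x - \eta)} \int_{x_0}^{\eta} e^{\beta(\eta - \xi)} g(\xi)\, d\xi\, d\eta \]

with

\[ c_1 = K_1 - \frac{K_2}{\beta - \alpha}, \qquad c_2 = \frac{K_2}{\beta - \alpha}. \]

By setting $g = 0$, one obtains the general solution of the homogeneous problem

\[ y_h(x) = c_1 e^{\alpha(x - x_0)} + c_2 e^{\beta(x - x_0)}. \]

The double integral

\[ \int_{x_0}^{x} e^{\alpha(x - \eta)} \int_{x_0}^{\eta} e^{\beta(\eta - \xi)} g(\xi)\, d\xi\, d\eta \]

is a particular solution of the inhomogeneous problem. Note that, due to the linearity
of the problem, the superposition principle (see Proposition 19.5) is again valid.
Summarising the above calculations gives the following two propositions.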


Proposition 19.25 Consider the homogeneous differential equation

\[ y''(x) + a y'(x) + b y(x) = 0 \]

and let $\alpha$ and $\beta$ denote the roots of its characteristic equation

\[ \lambda^2 + a \lambda + b = 0. \]

The general (real) solution of this problem is given by

\[ y_h(x) = \begin{cases} c_1 e^{\alpha x} + c_2 e^{\beta x} & \text{for } \alpha \ne \beta \in \mathbb{R}, \\ (c_1 + c_2 x)\, e^{\alpha x} & \text{for } \alpha = \beta \in \mathbb{R}, \\ e^{\rho x} \left( c_1 \cos(\theta x) + c_2 \sin(\theta x) \right) & \text{for } \alpha = \rho + i\theta, \ \rho, \theta \in \mathbb{R}, \end{cases} \]

for arbitrary real constants $c_1$ and $c_2$.

Proof Since the characteristic equation has real coefficients, the roots are either both
real or conjugate complex, i.e. $\alpha = \bar{\beta}$. The case $\alpha \ne \beta$ was already considered above.
In the complex case where $\alpha = \rho + i\theta$, we use Euler's formula

\[ e^{\alpha x} = e^{\rho x} \left( \cos(\theta x) + i \sin(\theta x) \right). \]

This shows that $c_1 e^{\rho x} \cos(\theta x)$ and $c_2 e^{\rho x} \sin(\theta x)$ are the searched for real solutions.
Finally, in the case $\alpha = \beta$, the above calculations show

\[
\begin{aligned}
y_h(x) &= K_1 e^{\alpha(x - x_0)} + K_2 \int_{x_0}^{x} e^{\alpha(x - \eta)} e^{\alpha(\eta - x_0)}\, d\eta \\
&= (c_1 + c_2 x)\, e^{\alpha x}
\end{aligned}
\]

with $c_1 = (K_1 - K_2 x_0)\, e^{-\alpha x_0}$ and $c_2 = K_2 e^{-\alpha x_0}$. □

Proposition 19.26 Let $y_p$ be an arbitrary solution of the inhomogeneous differential
equation

\[ y''(x) + a y'(x) + b y(x) = g(x). \]

Then its general solution can be written as

\[ y(x) = y_h(x) + y_p(x) \]

where $y_h$ is the general solution of the homogeneous problem.

Proof Superposition principle. □

Example 19.27 In order to find the general solution of the inhomogeneous
differential equation

\[ y''(x) - 4 y(x) = e^x \]

we first consider the homogeneous part. Its characteristic equation $\lambda^2 - 4 = 0$ has
the roots $\lambda_1 = 2$ and $\lambda_2 = -2$. Therefore,

\[ y_h(x) = c_1 e^{2x} + c_2 e^{-2x}. \]

A particular solution of the inhomogeneous problem is found by the general formula

\[
\begin{aligned}
y_p(x) &= \int_0^x e^{2(x - \eta)} \int_0^{\eta} e^{-2(\eta - \xi)} e^{\xi}\, d\xi\, d\eta \\
&= e^{2x} \int_0^x e^{-4\eta}\, \frac{1}{3} \left( e^{3\eta} - 1 \right) d\eta \\
&= \frac{1}{3} e^{2x} \left( \left( 1 - e^{-x} \right) + \frac{1}{4} \left( e^{-4x} - 1 \right) \right).
\end{aligned}
\]

Comparing this with $y_h$ shows that the choice $y_p(x) = -\frac{1}{3} e^x$ is possible as well,
since the other terms solve the homogeneous equation.

In general, however, it is simpler to use as ansatz for $y_p$ a linear combination
of the inhomogeneity and its derivatives. In our case, the ansatz would be
$y_p(x) = a e^x$. Inserting this ansatz into the inhomogeneous problem gives $a - 4a = 1$, which
results again in $y_p(x) = -\frac{1}{3} e^x$.

Example 19.28 The characteristic equation of the homogeneous problem

\[ y''(x) - 10 y'(x) + 25 y(x) = 0 \]

has the double root $\lambda_1 = \lambda_2 = 5$. Therefore, its general solution is

\[ y(x) = c_1 e^{5x} + c_2 x e^{5x}. \]

Example 19.29 The characteristic equation of the homogeneous problem

\[ y''(x) + 2 y'(x) + 2 y(x) = 0 \]

has the complex conjugate roots $\lambda_1 = -1 + i$ and $\lambda_2 = -1 - i$. The complex form
of its general solution is

\[ y(x) = c_1 e^{-(1 + i)x} + c_2 e^{-(1 - i)x} \]

with complex coefficients $c_1$ and $c_2$.
The real form is

\[ y(x) = e^{-x} \left( d_1 \cos x + d_2 \sin x \right) \]

with real coefficients $d_1$ and $d_2$.


19.7  Exercises 

1.  Find  the  general  solution  of  the  following  differential  equations  and  sketch  some 
solution  curves 

(a) $\dot x = \dfrac{x}{t}$, (b) $\dot x = \dfrac{\ldots}{x}$, (c) $\dot x = \dfrac{\ldots}{x}$.

The  direction  field  is  most  easily  plotted  with  maple,  e.g.  with  DEplot. 

2. Using the applet Dynamical systems in the plane, solve Exercise 1 by rewriting the respective differential equation as an equivalent autonomous system by adding the equation $\dot t = 1$.

Hint. The variables are denoted by x and y in the applet. For example, Exercise 1(a) would have to be written as x' = x/y and y' = 1.

3. According to Newton's law of cooling, the rate of change of the temperature $x$ of an object is proportional to the difference of its temperature and the ambient temperature $a$. This is modelled by the differential equation

\[
\dot x = k(a - x),
\]

where $k$ is a proportionality constant. Find the general solution of this differential equation.

How long does it take to cool down an object from $x(0) = 100°$ to $40°$ at an ambient temperature of $20°$, if it cooled down from $100°$ to $80°$ in 5 minutes?

4. Solve Verhulst's differential equation from Example 19.9 and compute the limit $t \to \infty$ of the solution.
5. A tank contains 100 l of liquid A. Liquid B is added at a rate of 5 l/s, while at the same time the mixture is pumped out at a rate of 10 l/s. We are interested in the amount $x(t)$ of the liquid B in the tank at time $t$. From the balance equation $\dot x(t) = \text{rate(in)} - \text{rate(out)} = \text{rate(in)} - 10 \cdot x(t)/\text{total amount}(t)$ one obtains the differential equation

\[
\dot x = 5 - \frac{10x}{100 - 5t}, \qquad x(0) = 0.
\]

Explain the derivation of this equation in detail and use maple (with dsolve) to solve the initial value problem. When is the tank empty?

6.  Solve  the  differential  equations 

(a)  y'  =  ay,  (b)  y'  =  ay  +  2 


with  the  method  of  power  series. 

7. Find the solution of the initial value problem

\[
\dot x(t) = 1 + x(t)^2
\]

with initial value $x(0) = 0$. In which interval does the solution exist?

8. Find the solution of the initial value problem

\[
\dot x(t) + 2x(t) = e^{4t} + 1
\]

with initial value $x(0) = 0$.

9.  Find  the  general  solutions  of  the  differential  equations 

(a)  x  +  4x  —  5x  =  0,  (b)  *  +  4x  +  5x  =  0,  (c)  *  +  4x  =  0. 

10. Find a particular solution of the problem

\[
\ddot x(t) + \dot x(t) - 6x(t) = t^2 + 2t - 1.
\]

Hint. Use the ansatz $y_p(t) = at^2 + bt + c$.

11. Find the general solution of the differential equation

\[
y''(x) + 4y(x) = \cos x
\]

and specify the solution for the initial data $y(0) = 1$, $y'(0) = 0$.

Hint. Consider the ansatz $y_p(x) = k_1 \cos x + k_2 \sin x$.

12. Find the general solution of the differential equation

\[
y''(x) + 4y'(x) + 5y(x) = \cos 2x
\]

and specify the solution for the initial data $y(0) = 1$, $y'(0) = 0$.

Hint. Consider the ansatz $y_p(x) = k_1 \cos 2x + k_2 \sin 2x$.

13. Find the general solution of the homogeneous equation

\[
y''(x) + 2y'(x) + y(x) = 0.
\]
Systems of Differential Equations


Systems of differential equations, often called differentiable dynamical systems, play a vital role in modelling time-dependent processes in mechanics, meteorology, biology, medicine, economics and other sciences. We limit ourselves to two-dimensional systems, whose solutions (trajectories) can be graphically represented as curves in the plane. The first section introduces linear systems, which can be solved analytically as will be shown. In many applications, however, nonlinear systems are required. In general, their solution cannot be given explicitly. Here it is of primary interest to understand the qualitative behaviour of solutions. In the second section of this chapter, we touch upon the rich qualitative theory of dynamical systems. The third section is devoted to analysing the mathematical pendulum in various ways. Numerical methods will be discussed in Chap. 21.


20.1  Systems  of  Linear  Differential  Equations 

We start with the description of various situations which lead to systems of differential equations. In Chap. 19 Malthus' population model was presented, where the rate of change of a population $x(t)$ was assumed proportional to the existing population:

\[
\dot x(t) = a\,x(t).
\]

The presence of a second population $y(t)$ could result in a decrease or increase of the rate of change of $x(t)$. Conversely, the population $x(t)$ could also affect the rate of change of $y(t)$. This results in a coupled system of equations

\[
\begin{aligned}
\dot x(t) &= a\,x(t) + b\,y(t),\\
\dot y(t) &= c\,x(t) + d\,y(t),
\end{aligned}
\]

with positive or negative coefficients $b$ and $c$, which describe the interaction of the populations. This is the general form of a linear system of differential equations in two unknowns, written for short as

\[
\begin{aligned}
\dot x &= ax + by,\\
\dot y &= cx + dy.
\end{aligned}
\]

Refined models are obtained if one takes into account the dependence of the rate of growth on food supply, for instance. For one species this would result in an equation of the form

\[
\dot x = (v - n)\,x,
\]

where $v$ denotes the available food supply and $n$ a threshold value. So, the population is increasing if the available quantity of food is larger than $n$, and is otherwise decreasing. In the case of a predator-prey relationship of species $x$ to species $y$, in which $y$ is the food for $x$, the relative rates of change are not constant. A common assumption is that these rates contain a term that depends linearly on the other species. Under this assumption, one obtains the nonlinear system

\[
\begin{aligned}
\dot x &= (ay - n)\,x,\\
\dot y &= (d - cx)\,y.
\end{aligned}
\]

This is the famous predator-prey model of Lotka and Volterra (for a detailed derivation we refer to [13, Chap. 12.2]).

The general form of a system of nonlinear differential equations is

\[
\begin{aligned}
\dot x &= f(x, y),\\
\dot y &= g(x, y).
\end{aligned}
\]

Geometrically this can be interpreted in the following way. The right-hand side defines a vector field

\[
(x, y) \mapsto \begin{bmatrix} f(x,y)\\ g(x,y) \end{bmatrix}
\]

on $\mathbb{R}^2$; the left-hand side is the velocity vector of a plane curve

\[
t \mapsto \begin{bmatrix} x(t)\\ y(t) \end{bmatrix}.
\]

The solutions are thus plane curves whose velocity vectors are given by the vector field.


1 A.J. Lotka, 1880-1949

2 V. Volterra, 1860-1940
Fig. 20.1 Vector field and solution curves

Example 20.1 (Rotation of the plane) The vector field

\[
(x, y) \mapsto \begin{bmatrix} -y\\ x \end{bmatrix}
\]

is perpendicular to the corresponding position vectors $[x, y]^{\mathsf T}$, see Fig. 20.1. The solutions of the system of differential equations

\[
\begin{aligned}
\dot x &= -y,\\
\dot y &= x
\end{aligned}
\]

are the circles (Fig. 20.1)

\[
\begin{aligned}
x(t) &= R\cos t,\\
y(t) &= R\sin t,
\end{aligned}
\]

where the radius $R$ is given by the initial values, for instance, $x(0) = R$ and $y(0) = 0$.


Remark 20.2 The geometrical, two-dimensional representation is made possible by the fact that the right-hand side of the system does not depend on time $t$ explicitly. Such systems are called autonomous. A representation which includes the time axis (like in Chap. 19) would require a three-dimensional plot with a three-dimensional direction field

\[
(x, y, t) \mapsto \begin{bmatrix} f(x,y)\\ g(x,y)\\ 1 \end{bmatrix}.
\]

The solutions are represented as spatial curves

\[
t \mapsto \begin{bmatrix} x(t)\\ y(t)\\ t \end{bmatrix},
\]

see the space-time diagram in Fig. 20.2.

Fig. 20.2 Direction field and space-time diagram for $\dot x = -y$, $\dot y = x$


Example 20.3 Another type of example which demonstrates the meaning of the vector field and the solution curves is obtained from the flow of ideal fluids. For example,

\[
\begin{aligned}
\dot x &= 1 - \frac{x^2 - y^2}{(x^2 + y^2)^2},\\
\dot y &= \frac{-2xy}{(x^2 + y^2)^2}
\end{aligned}
\]

describes a plane, stationary potential flow around the cylinder $x^2 + y^2 \le 1$ (Fig. 20.3). The right-hand side describes the flow velocity at the point $(x, y)$. The solution curves follow the streamlines

\[
y\left(1 - \frac{1}{x^2 + y^2}\right) = C.
\]

Fig. 20.3 Plane potential flow around a cylinder

Here $C$ denotes a constant. This can be checked by differentiating the above relation with respect to $t$ and substituting $\dot x$ and $\dot y$ by the right-hand side of the differential equation.
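The check can be written out as a short worked computation. It assumes that the flow field is $\dot x = 1 - (x^2-y^2)/(x^2+y^2)^2$, $\dot y = -2xy/(x^2+y^2)^2$ and that the streamline relation reads $\psi(x,y) = C$ with $\psi$ as below; the symbol $\psi$ is introduced here only for illustration.

\[
\psi(x,y) = y - \frac{y}{x^2+y^2}, \qquad
\frac{\partial \psi}{\partial x} = \frac{2xy}{(x^2+y^2)^2} = -\dot y, \qquad
\frac{\partial \psi}{\partial y} = 1 - \frac{x^2-y^2}{(x^2+y^2)^2} = \dot x,
\]

\[
\frac{d}{dt}\,\psi\bigl(x(t),y(t)\bigr)
= \frac{\partial \psi}{\partial x}\,\dot x + \frac{\partial \psi}{\partial y}\,\dot y
= -\dot y\,\dot x + \dot x\,\dot y = 0 .
\]

Hence $\psi$ is constant along every solution curve, which is exactly the statement that the solutions follow the streamlines.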

Experiment 20.4 Using the applet Dynamical systems in the plane, study the vector field and the solution curves of the system of differential equations from Examples 20.1 and 20.3. In a similar way, study the systems of differential equations

\[
\begin{aligned}
\dot x &= y, & \dot x &= y, & \dot x &= -y, & \dot x &= x, & \dot x &= y,\\
\dot y &= -x, & \dot y &= x, & \dot y &= -x, & \dot y &= x, & \dot y &= y
\end{aligned}
\]

and try to understand the behaviour of the solution curves.

Before turning to the solution theory of planar linear systems of differential equations, it is useful to introduce a couple of notions that serve to describe the qualitative behaviour of solution curves. The system of differential equations

\[
\begin{aligned}
\dot x(t) &= f(x(t), y(t)),\\
\dot y(t) &= g(x(t), y(t))
\end{aligned}
\]

together with prescribed values at $t = 0$

\[
x(0) = x_0, \qquad y(0) = y_0,
\]

is again called an initial value problem. In this chapter we assume the functions $f$ and $g$ to be at least continuous. By a solution curve or a trajectory we mean a continuously differentiable curve $t \mapsto [x(t)\ y(t)]^{\mathsf T}$ whose components fulfil the system of differential equations.

For  the  case  of  a  single  differential  equation  the  notion  of  an  equilibrium  point 
was  introduced  in  Definition  19.21.  For  systems  of  differential  equations  one  has  an 
analogous  notion. 

Definition 20.5 (Equilibrium point) A point $(x^*, y^*)$ is called equilibrium point or equilibrium of the system of differential equations, if $f(x^*, y^*) = 0$ and $g(x^*, y^*) = 0$.

The name comes from the fact that a solution with initial value $x_0 = x^*$, $y_0 = y^*$ remains at $(x^*, y^*)$ for all times; in other words, if $(x^*, y^*)$ is an equilibrium point, then $x(t) = x^*$, $y(t) = y^*$ is a solution to the system of differential equations since both the left- and right-hand sides will be zero.

From  Chap.  19  we  know  that  solutions  of  differential  equations  do  not  have  to 
exist  for  large  times.  However,  if  solutions  with  initial  values  in  a  neighbourhood  of 
an  equilibrium  point  exist  for  all  times  then  the  following  notions  are  meaningful. 
Definition 20.6 Let $(x^*, y^*)$ be an equilibrium point. If there is a neighbourhood $U$ of $(x^*, y^*)$ so that all trajectories with initial values $(x_0, y_0)$ in $U$ converge to the equilibrium point $(x^*, y^*)$ as $t \to \infty$, then this equilibrium is called asymptotically stable. If for every neighbourhood $V$ of $(x^*, y^*)$ there is a neighbourhood $W$ of $(x^*, y^*)$ so that all trajectories with initial values $(x_0, y_0)$ in $W$ stay entirely in $V$, then the equilibrium $(x^*, y^*)$ is called stable. An equilibrium point which is not stable is called unstable.

In  short,  stability  means  that  trajectories  that  start  close  to  the  equilibrium  point 
remain  close  to  it;  asymptotic  stability  means  that  the  trajectories  are  attracted  by  the 
equilibrium  point.  In  the  case  of  an  unstable  equilibrium  point  there  are  trajectories 
that  move  away  from  it;  in  linear  systems  these  trajectories  are  unbounded,  and  in  the 
nonlinear  case  they  can  also  converge  to  another  equilibrium  or  a  periodic  solution 
(for  instance,  see  the  discussion  of  the  mathematical  pendulum  in  Sect.  20.3  or  [13]). 

In the following we determine the solution to the initial value problem

\[
\begin{aligned}
\dot x &= ax + by, & x(0) &= x_0,\\
\dot y &= cx + dy, & y(0) &= y_0.
\end{aligned}
\]

This is a two-dimensional system of first-order linear differential equations. For this purpose we first discuss the three basic types of such systems and then show how arbitrary systems can be transformed to a system of basic type.

We denote the coefficient matrix by

\[
\mathbf{A} = \begin{bmatrix} a & b\\ c & d \end{bmatrix}.
\]
The  decisive  question  is  whether  A  is  similar  to  a  matrix  of  type  I,  II  or  III,  as 
described  in  Appendix  B.2.  A  matrix  of  type  I  has  real  eigenvalues  and  is  similar  to  a 
diagonal  matrix.  A  matrix  of  type  II  has  a  double  real  eigenvalue;  its  canonical  form, 
however,  contains  a  nilpotent  part.  The  case  of  two  complex  conjugate  eigenvalues 
is  finally  covered  by  type  III. 

Type I: real eigenvalues, diagonalisable matrix. In this case the standard form of the system is

\[
\begin{aligned}
\dot x &= \alpha x, & x(0) &= x_0,\\
\dot y &= \beta y, & y(0) &= y_0.
\end{aligned}
\]

We know from Example 19.7 that the solutions are given by

\[
x(t) = x_0 e^{\alpha t}, \qquad y(t) = y_0 e^{\beta t}
\]

and in particular exist for all times $t \in \mathbb{R}$. Obviously $(x^*, y^*) = (0, 0)$ is an equilibrium point. If $\alpha < 0$ and $\beta < 0$, then all solution curves approach the equilibrium $(0, 0)$ as $t \to \infty$; this equilibrium is asymptotically stable. If $\alpha \ge 0$, $\beta \ge 0$ (not both equal to zero), then the solution curves leave every neighbourhood of $(0, 0)$ and the equilibrium is unstable. Similarly, instability is present in the case where $\alpha > 0$, $\beta < 0$ (or vice versa). One calls such an equilibrium a saddle point.

Fig. 20.4 Real eigenvalues, unstable equilibrium

If $\alpha \ne 0$ and $x_0 \ne 0$, then one can solve for $t$ and represent the solution curves as graphs of functions:

\[
y = y_0 \left(\frac{x}{x_0}\right)^{\beta/\alpha}.
\]
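The elimination of $t$ can be spelled out as follows. The computation uses only the solution formulas $x(t) = x_0 e^{\alpha t}$ and $y(t) = y_0 e^{\beta t}$ of the type I system, together with the observation that $x/x_0 = e^{\alpha t} > 0$ along each solution, so the logarithm is defined:

\[
\frac{x}{x_0} = e^{\alpha t}
\;\Longrightarrow\;
t = \frac{1}{\alpha}\,\log\frac{x}{x_0}
\;\Longrightarrow\;
y = y_0\, e^{\beta t} = y_0 \exp\!\left(\frac{\beta}{\alpha}\,\log\frac{x}{x_0}\right) = y_0 \left(\frac{x}{x_0}\right)^{\beta/\alpha}.
\]

For instance, $\alpha = 1$, $\beta = 2$ gives parabolas $y = y_0 (x/x_0)^2$, while $\alpha = 1$, $\beta = -2$ gives the curves $y = y_0 (x_0/x)^2$ typical of a saddle point.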


Example 20.7 The three systems

\[
\begin{aligned}
\dot x &= x, & \dot x &= -x, & \dot x &= x,\\
\dot y &= 2y, & \dot y &= -2y, & \dot y &= -2y
\end{aligned}
\]

have the solutions

\[
\begin{aligned}
x(t) &= x_0 e^{t}, & x(t) &= x_0 e^{-t}, & x(t) &= x_0 e^{t},\\
y(t) &= y_0 e^{2t}, & y(t) &= y_0 e^{-2t}, & y(t) &= y_0 e^{-2t},
\end{aligned}
\]

respectively. The vector fields and some solutions are shown in Figs. 20.4, 20.5 and 20.6. One recognises that all coordinate half axes are solution curves.


Type II: double real eigenvalue, not diagonalisable. The case of a double real eigenvalue $\alpha = \beta$ is a special case of type I, if the coefficient matrix is diagonalisable. There is, however, the particular situation of a double eigenvalue and a nilpotent part. Then the standard form of the system is

\[
\begin{aligned}
\dot x &= \alpha x + y, & x(0) &= x_0,\\
\dot y &= \alpha y, & y(0) &= y_0.
\end{aligned}
\]

We compute the solution component

\[
y(t) = y_0 e^{\alpha t},
\]

substitute it into the first equation

\[
\dot x(t) = \alpha x(t) + y_0 e^{\alpha t}, \qquad x(0) = x_0,
\]

and apply the variation of constants formula from Chap. 19:

\[
x(t) = e^{\alpha t}\left(x_0 + \int_0^t e^{-\alpha s}\, y_0\, e^{\alpha s}\,ds\right) = e^{\alpha t}\left(x_0 + t\,y_0\right).
\]

The vector fields and some solution curves for the case $\alpha = -1$ are depicted in Fig. 20.7.

Fig. 20.5 Real eigenvalues

Fig. 20.6 Real eigenvalues

Fig. 20.7 Double real eigenvalue, matrix not diagonalisable
Type III: complex conjugate eigenvalues. In this case the standard form of the system is

\[
\begin{aligned}
\dot x &= \alpha x - \beta y, & x(0) &= x_0,\\
\dot y &= \beta x + \alpha y, & y(0) &= y_0.
\end{aligned}
\]

By introducing the complex variable $z$ and the complex coefficients $\gamma$, $z_0$ as

\[
z = x + \mathrm{i}y, \qquad \gamma = \alpha + \mathrm{i}\beta, \qquad z_0 = x_0 + \mathrm{i}y_0,
\]

we see that the above system represents the real and the imaginary parts of the equation

\[
(\dot x + \mathrm{i}\dot y) = (\alpha + \mathrm{i}\beta)(x + \mathrm{i}y), \qquad x(0) + \mathrm{i}y(0) = x_0 + \mathrm{i}y_0.
\]

From the complex formulation

\[
\dot z = \gamma z, \qquad z(0) = z_0,
\]

the solutions can be derived immediately:

\[
z(t) = z_0\, e^{\gamma t}.
\]

Splitting the left- and right-hand sides into real and imaginary parts, one obtains

\[
x(t) + \mathrm{i}y(t) = (x_0 + \mathrm{i}y_0)\, e^{(\alpha + \mathrm{i}\beta)t} = (x_0 + \mathrm{i}y_0)\, e^{\alpha t}\left(\cos \beta t + \mathrm{i} \sin \beta t\right).
\]

From that we get (see Sect. 4.2)

\[
\begin{aligned}
x(t) &= x_0 e^{\alpha t} \cos \beta t - y_0 e^{\alpha t} \sin \beta t,\\
y(t) &= x_0 e^{\alpha t} \sin \beta t + y_0 e^{\alpha t} \cos \beta t.
\end{aligned}
\]

The point $(x^*, y^*) = (0, 0)$ is again an equilibrium point. In the case $\alpha < 0$ it is asymptotically stable; for $\alpha > 0$ it is unstable; for $\alpha = 0$ it is stable but not asymptotically stable. Indeed, for $\alpha = 0$ the solution curves are circles and hence bounded, but are not attracted by the origin as $t \to \infty$.
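The agreement between the complex formulation and the real solution formulas can be illustrated numerically. The sketch below is only an illustration under the standard form stated above; the function names and the sample parameter values are arbitrary choices, not taken from the book's programs.

```python
import cmath
import math

def type3_solution(alpha, beta, x0, y0, t):
    """Real closed form for x' = alpha*x - beta*y, y' = beta*x + alpha*y."""
    growth = math.exp(alpha * t)
    c, s = math.cos(beta * t), math.sin(beta * t)
    return (growth * (x0 * c - y0 * s), growth * (x0 * s + y0 * c))

def type3_complex(alpha, beta, x0, y0, t):
    """Same solution via z(t) = z0 * exp(gamma*t) with gamma = alpha + i*beta."""
    z = complex(x0, y0) * cmath.exp(complex(alpha, beta) * t)
    return (z.real, z.imag)

x, y = type3_solution(-0.1, 1.0, 1.0, 0.5, 3.0)
u, v = type3_complex(-0.1, 1.0, 1.0, 0.5, 3.0)
print(abs(x - u), abs(y - v))  # differences at rounding level

# For alpha = 0 the distance to the origin stays at its initial value (circles).
print(math.hypot(*type3_solution(0.0, 2.0, 3.0, 4.0, 1.7)))
```

The second print illustrates the case $\alpha = 0$: the trajectory through $(3, 4)$ stays on the circle of radius 5 up to rounding.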

Example 20.8 The vector fields and solution curves for the two systems

\[
\begin{aligned}
\dot x &= \tfrac{1}{10}x - y, & \dot x &= -\tfrac{1}{10}x - y,\\
\dot y &= x + \tfrac{1}{10}y, & \dot y &= x - \tfrac{1}{10}y
\end{aligned}
\]

are given in Figs. 20.8 and 20.9. For the stable case $\dot x = -y$, $\dot y = x$ we refer to Fig. 20.1.

Fig. 20.8 Complex eigenvalues, unstable

Fig. 20.9 Complex eigenvalues, asymptotically stable

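General solution of a linear system of differential equations. The similarity transformation from Appendix B allows us to solve arbitrary linear systems of differential equations by reduction to the three standard cases.

Proposition 20.9 For an arbitrary $(2 \times 2)$-matrix $\mathbf{A}$, the initial value problem

\[
\begin{bmatrix} \dot x(t)\\ \dot y(t) \end{bmatrix} = \mathbf{A} \begin{bmatrix} x(t)\\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0)\\ y(0) \end{bmatrix} = \begin{bmatrix} x_0\\ y_0 \end{bmatrix}
\]

has a unique solution that exists for all times $t \in \mathbb{R}$. This solution can be computed explicitly by transformation to one of the types I, II or III.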


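Proof According to Appendix B.2 there is an invertible matrix $\mathbf{T}$ such that

\[
\mathbf{T}^{-1}\mathbf{A}\mathbf{T} = \mathbf{B},
\]

where $\mathbf{B}$ belongs to one of the standard types I, II, III. We set

\[
\begin{bmatrix} u\\ v \end{bmatrix} = \mathbf{T}^{-1} \begin{bmatrix} x\\ y \end{bmatrix}
\]

and obtain the transformed system

\[
\begin{bmatrix} \dot u\\ \dot v \end{bmatrix}
= \mathbf{T}^{-1} \begin{bmatrix} \dot x\\ \dot y \end{bmatrix}
= \mathbf{T}^{-1}\mathbf{A} \begin{bmatrix} x\\ y \end{bmatrix}
= \mathbf{T}^{-1}\mathbf{A}\mathbf{T} \begin{bmatrix} u\\ v \end{bmatrix}
= \mathbf{B} \begin{bmatrix} u\\ v \end{bmatrix},
\qquad
\begin{bmatrix} u(0)\\ v(0) \end{bmatrix} = \mathbf{T}^{-1} \begin{bmatrix} x_0\\ y_0 \end{bmatrix}.
\]

We solve this system of differential equations depending on its type, as explained above. Each of these systems in standard form has a unique solution which exists for all times. The reverse transformation

\[
\begin{bmatrix} x\\ y \end{bmatrix} = \mathbf{T} \begin{bmatrix} u\\ v \end{bmatrix}
\]

yields the solution of the original system. □

Fig. 20.10 Example 20.10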

Thus,  modulo  a  linear  transformation,  the  types  I,  II,  III  actually  comprise  all 
cases  that  can  occur. 

Example 20.10 We study the solution curves of the system

\[
\begin{aligned}
\dot x &= x + 2y,\\
\dot y &= 2x + y.
\end{aligned}
\]

The corresponding coefficient matrix

\[
\mathbf{A} = \begin{bmatrix} 1 & 2\\ 2 & 1 \end{bmatrix}
\]

has the eigenvalues $\lambda_1 = 3$ and $\lambda_2 = -1$ with respective eigenvectors $\mathbf{e}_1 = [1\ 1]^{\mathsf T}$ and $\mathbf{e}_2 = [-1\ 1]^{\mathsf T}$. It is of type I, and the origin is a saddle point. The vector field and some solutions can be seen in Fig. 20.10.
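Deciding which of the three types applies can be sketched in a few lines of code. The helper below is a hypothetical illustration, not one of the book's programs: it takes the entries of a real $(2 \times 2)$-matrix, forms the characteristic polynomial $\lambda^2 - (a+d)\lambda + (ad - bc)$ and inspects its discriminant. The name `classify` and the tolerance `tol` are assumptions made for this example.

```python
import math

def classify(a, b, c, d, tol=1e-12):
    """Return (type, data) for the matrix [[a, b], [c, d]].

    data is the pair of real eigenvalues for types I and II,
    and (mu, nu) for the complex pair mu +/- i*nu of type III.
    """
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4.0 * det           # discriminant of lambda^2 - tr*lambda + det
    if disc > tol:                       # two distinct real eigenvalues
        r = math.sqrt(disc)
        return "I", ((tr + r) / 2.0, (tr - r) / 2.0)
    if disc < -tol:                      # complex conjugate eigenvalues
        return "III", (tr / 2.0, math.sqrt(-disc) / 2.0)
    lam = tr / 2.0                       # double real eigenvalue
    # A 2x2 matrix with a double eigenvalue is diagonalisable only if it is
    # a multiple of the identity matrix.
    if abs(b) <= tol and abs(c) <= tol and abs(a - d) <= tol:
        return "I", (lam, lam)
    return "II", (lam, lam)

print(classify(1, 2, 2, 1))    # Example 20.10
print(classify(0, -1, 1, 0))   # rotation of Example 20.1
print(classify(-1, 1, 0, -1))  # standard form of type II with alpha = -1
```

For the matrix of Example 20.10 the sketch reports type I with eigenvalues 3 and -1; eigenvalues of opposite sign indicate the saddle point.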
Remark 20.11 The proof of Proposition 20.9 shows the structure of the general solution of a linear system of differential equations. Assume, for example, that the roots $\lambda_1$, $\lambda_2$ of the characteristic polynomial of the coefficient matrix are real and distinct, so the system is of type I. The general solution in transformed coordinates is given by

\[
u(t) = C_1 e^{\lambda_1 t}, \qquad v(t) = C_2 e^{\lambda_2 t}.
\]

If we denote the columns of the transformation matrix by $\mathbf{t}_1$, $\mathbf{t}_2$, then the solution in the original coordinates is

\[
\begin{bmatrix} x(t)\\ y(t) \end{bmatrix}
= \mathbf{t}_1 u(t) + \mathbf{t}_2 v(t)
= \begin{bmatrix} t_{11} C_1 e^{\lambda_1 t} + t_{12} C_2 e^{\lambda_2 t}\\ t_{21} C_1 e^{\lambda_1 t} + t_{22} C_2 e^{\lambda_2 t} \end{bmatrix}.
\]

Every component is a particular linear combination of the transformed solutions $u(t)$, $v(t)$. In the case of complex conjugate roots $\mu \pm \mathrm{i}\nu$ (type III) the components of the general solution are particular linear combinations of the functions $e^{\mu t} \cos \nu t$ and $e^{\mu t} \sin \nu t$. In the case of a double root $\alpha$ (type II), the components are given as linear combinations of the functions $e^{\alpha t}$ and $t e^{\alpha t}$.


20.2  Systems  of  Nonlinear  Differential  Equations 

In  contrast  to  linear  systems  of  differential  equations,  the  solutions  to  nonlinear 
systems  can  generally  not  be  expressed  by  explicit  formulas.  Apart  from  numerical 
methods  (Chap.  21)  the  qualitative  theory  is  of  interest.  It  describes  the  behaviour  of 
solutions  without  knowing  them  explicitly.  In  this  section  we  will  demonstrate  this 
with  the  help  of  an  example  from  population  dynamics. 

The Lotka-Volterra model. In Sect. 20.1 the predator-prey model of Lotka and Volterra was introduced. In order to simplify the presentation, we set all coefficients equal to one. Thus the system becomes

\[
\begin{aligned}
\dot x &= x(y - 1),\\
\dot y &= y(1 - x).
\end{aligned}
\]

The equilibrium points are $(x^*, y^*) = (1, 1)$ and $(x^{**}, y^{**}) = (0, 0)$. Obviously, the coordinate half axes are solution curves given by

\[
\begin{aligned}
x(t) &= x_0 e^{-t}, & x(t) &= 0,\\
y(t) &= 0, & y(t) &= y_0 e^{t}.
\end{aligned}
\]

The equilibrium $(0, 0)$ is thus a saddle point (unstable); we will later analyse the type of equilibrium $(1, 1)$. In the following we will only consider the first quadrant $x > 0$, $y > 0$, which is relevant in biological models. Along the straight line $x = 1$ the vector field is horizontal, and along the straight line $y = 1$ it is vertical. It looks as if the solution curves rotate about the equilibrium point $(1, 1)$, see Fig. 20.11.

Fig. 20.11 Vector field of the Lotka-Volterra model

In order to be able to verify this conjecture we search for a function $H(x, y)$ which is constant along the solution curves:

\[
H(x(t), y(t)) = C.
\]

Such a function is called a first integral, invariant or conserved quantity of the system of differential equations. Consequently, we have

\[
\frac{d}{dt} H(x(t), y(t)) = 0
\]

or by the chain rule for functions in two variables (Proposition 15.16)

\[
\frac{\partial H}{\partial x}\,\dot x + \frac{\partial H}{\partial y}\,\dot y = 0.
\]

With the ansatz

\[
H(x, y) = F(x) + G(y),
\]

we should have

\[
F'(x)\,\dot x + G'(y)\,\dot y = 0.
\]

Inserting the differential equations we obtain

\[
F'(x)\,x(y - 1) + G'(y)\,y(1 - x) = 0,
\]

and a separation of the variables yields

\[
\frac{x F'(x)}{x - 1} = \frac{y G'(y)}{y - 1}.
\]

Since the variables $x$ and $y$ are independent of each other, this is only possible if both sides are constant:

\[
\frac{x F'(x)}{x - 1} = C, \qquad \frac{y G'(y)}{y - 1} = C.
\]

It follows that

\[
F'(x) = C\left(1 - \frac{1}{x}\right), \qquad G'(y) = C\left(1 - \frac{1}{y}\right)
\]

and thus

\[
H(x, y) = C\,(x - \log x + y - \log y) + D.
\]

This function has a global minimum at $(x^*, y^*) = (1, 1)$, as can also be seen in Fig. 20.12.

Fig. 20.12 First integral and level sets

The solution curves of the Lotka-Volterra system lie on the level sets

\[
x - \log x + y - \log y = \text{const}.
\]

These  level  sets  are  obviously  closed  curves.  The  question  arises  whether  the  solution 
curves  are  also  closed,  and  the  solutions  thus  are  periodic.  In  the  following  proposition 
we  will  answer  this  question  affirmatively.  Periodic,  closed  solution  curves  are  called 
periodic  orbits. 
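That the first integral stays constant along solutions can be observed numerically. The sketch below integrates the Lotka-Volterra system (all coefficients equal to one) with the classical Runge-Kutta scheme, chosen here only for illustration since numerical methods are the subject of Chap. 21, and records how far $H(x, y) = x - \log x + y - \log y$ drifts from its initial value. Step size, number of steps and function names are assumptions of this example.

```python
import math

def field(x, y):
    """Right-hand side of x' = x*(y - 1), y' = y*(1 - x)."""
    return x * (y - 1.0), y * (1.0 - x)

def rk4_step(x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = field(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = field(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def H(x, y):
    """First integral with the constants chosen as C = 1, D = 0."""
    return x - math.log(x) + y - math.log(y)

def max_drift(x0, y0, h=0.001, steps=20000):
    """Largest deviation of H from H(x0, y0) along the numerical trajectory."""
    x, y, start, drift = x0, y0, H(x0, y0), 0.0
    for _ in range(steps):
        x, y = rk4_step(x, y, h)
        drift = max(drift, abs(H(x, y) - start))
    return drift

print(max_drift(0.5, 0.5))  # tiny: H is conserved up to discretisation error
```

A small drift is consistent with the trajectory staying on one level set; it does not by itself prove periodicity, which is the content of the proposition that follows.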

Proposition 20.12 For initial values $x_0 > 0$, $y_0 > 0$ the solution curves of the Lotka-Volterra system are periodic orbits and $(x^*, y^*) = (1, 1)$ is a stable equilibrium point.

Outline of proof The proof of the fact that the solution

\[
t \mapsto \begin{bmatrix} x(t)\\ y(t) \end{bmatrix}, \qquad \begin{bmatrix} x(0)\\ y(0) \end{bmatrix} = \begin{bmatrix} x_0\\ y_0 \end{bmatrix}
\]

exists (and is unique) for all initial values $x_0 > 0$, $y_0 > 0$ and all times $t \in \mathbb{R}$ requires methods that go beyond the scope of this book. The interested reader is referred to [13, Chap. 8]. In order to prove periodicity, we take initial values $(x_0, y_0) \ne (1, 1)$ and show that the corresponding solution curves return to the initial value after finite time $\tau > 0$. For that we split the first quadrant $x > 0$, $y > 0$ into four regions

\[
\begin{aligned}
Q_1 &: x > 1,\ y > 1; & Q_2 &: x < 1,\ y > 1;\\
Q_3 &: x < 1,\ y < 1; & Q_4 &: x > 1,\ y < 1
\end{aligned}
\]

and show that every solution curve moves (clockwise) through all four regions in finite time. For instance, consider the case $(x_0, y_0) \in Q_3$, so $0 < x_0 < 1$, $0 < y_0 < 1$. We want to show that the solution curve reaches the region $Q_2$ in finite time; i.e. $y(t)$ assumes the value one. From the differential equations it follows that

\[
\dot x = x(y - 1) < 0, \qquad \dot y = y(1 - x) > 0
\]

in region $Q_3$ and thus

\[
x(t) \le x_0, \qquad y(t) \ge y_0, \qquad \dot y(t) \ge y_0(1 - x_0),
\]

as long as $(x(t), y(t))$ stays in region $Q_3$. If $y(t)$ were less than one for all times $t > 0$, then the following inequalities would hold:

\[
1 > y(t) = y_0 + \int_0^t \dot y(s)\,ds \ge y_0 + \int_0^t y_0(1 - x_0)\,ds = y_0 + t\,y_0(1 - x_0).
\]

However, the latter expression diverges to infinity as $t \to \infty$, a contradiction. Consequently, $y(t)$ has to reach the value 1 and thus the region $Q_2$ in finite time. Likewise one reasons for the other regions. Thus there exists a time $\tau > 0$ such that $(x(\tau), y(\tau)) = (x_0, y_0)$.

From that the periodicity of the orbit follows. Since the system of differential equations is autonomous, $t \mapsto (x(t + \tau), y(t + \tau))$ is a solution as well. As just shown, both solutions have the same initial value at $t = 0$. The uniqueness of the solution of initial value problems implies that the two solutions are identical, so

\[
x(t) = x(t + \tau), \qquad y(t) = y(t + \tau)
\]

is fulfilled for all times $t \in \mathbb{R}$. This proves that the solution $t \mapsto (x(t), y(t))$ is periodic with period $\tau$.

All solution curves in the first quadrant with the exception of the equilibrium are thus periodic orbits. Solution curves that start close to $(x^*, y^*) = (1, 1)$ stay close, see Fig. 20.12. The point $(1, 1)$ is thus a stable equilibrium. □
Fig. 20.13 Solution curves of the Lotka-Volterra model

Figure 20.13 shows some solution curves. The populations of predator and prey thus increase and decrease periodically and in opposite direction. For further population models we refer to [6].


20.3  The  Pendulum  Equation 

As a second example of a nonlinear system we consider the mathematical pendulum. It models an object of mass $m$ that is attached to the origin with a (massless) cord of length $l$ and moves under the gravitational force $-mg$, see Fig. 20.14. The variable $x(t)$ denotes the angle of deflection from the vertical direction, measured in counterclockwise direction. The tangential acceleration of the object is equal to $l\ddot x(t)$, and the tangential component of the gravitational force is $-mg \sin x(t)$. According to Newton's law, force = mass × acceleration, we have

\[
-mg \sin x = ml\,\ddot x
\]

or

\[
ml\,\ddot x + mg \sin x = 0.
\]

This is a second-order nonlinear differential equation. We will later reduce it to a first-order system, but for a start, we wish to derive a conserved quantity.

Conservation of energy. Multiplying the pendulum equation by $l\dot x$ gives

\[
ml^2\,\dot x \ddot x + mgl\,\dot x \sin x = 0.
\]

We identify $\dot x \ddot x$ as the derivative of $\tfrac12 \dot x^2$ and $\dot x \sin x$ as the derivative of $1 - \cos x$ and arrive at a conserved quantity, which we denote by $H(x, \dot x)$:

\[
\frac{d}{dt} H(x, \dot x) = \frac{d}{dt}\left(\tfrac12\, ml^2 \dot x^2 + mgl\,(1 - \cos x)\right) = 0;
\]

that is, $H(x(t), \dot x(t))$ is constant when $x(t)$ is a solution of the pendulum equation.

Fig. 20.14 Derivation of the pendulum equation

Recall from mechanics that the kinetic energy of the moving mass is given by

\[
T(\dot x) = \tfrac12\, ml^2 \dot x^2.
\]

The potential energy is defined as the work required to move the mass from its height $-l$ at rest to position $-l \cos x$, that is

\[
U(x) = \int_{-l}^{-l \cos x} mg\,d\xi = mgl\,(1 - \cos x).
\]

Thus the conserved quantity is identified as the total energy

\[
H(x, \dot x) = T(\dot x) + U(x),
\]

in accordance with the well-known mechanical principle of conservation of total energy.
Note that the linearisation

\[
\sin x = x + \mathcal{O}(x^3) \approx x
\]

for small angles $x$ leads to the approximation

\[
ml\,\ddot x + mg\,x = 0.
\]

For convenience, we will cancel $m$ in the equation and set $g/l = 1$. Then the pendulum equation reads

\[
\ddot x = -\sin x,
\]

with the conserved quantity

\[
H(x, \dot x) = \tfrac12 \dot x^2 + 1 - \cos x,
\]

while the linearised pendulum equation reads

\[
\ddot x = -x.
\]

Reduction to a first-order system. Every explicit second-order differential equation $\ddot x = f(x, \dot x)$ can be reduced to a first-order system by introducing the new variable $y = \dot x$, resulting in the system

\[
\begin{aligned}
\dot x &= y,\\
\dot y &= f(x, y).
\end{aligned}
\]

Applying this procedure to the pendulum equation and adjoining initial data leads to the system

\[
\begin{aligned}
\dot x &= y, & x(0) &= x_0,\\
\dot y &= -\sin x, & y(0) &= y_0
\end{aligned}
\]

for the mathematical pendulum. Here $x$ denotes the angle of deflection and $y$ the angular velocity of the object.

The linearised pendulum equation can be written as the system

\[
\begin{aligned}
\dot x &= y, & x(0) &= x_0,\\
\dot y &= -x, & y(0) &= y_0.
\end{aligned}
\]

Apart from the change in sign this system of differential equations coincides with that of Example 20.1. It is a system of type III; hence its solution is given by

\[
\begin{aligned}
x(t) &= x_0 \cos t + y_0 \sin t,\\
y(t) &= -x_0 \sin t + y_0 \cos t.
\end{aligned}
\]

The first line exhibits the solution to the second-order linearised equation $\ddot x = -x$ with initial data $x(0) = x_0$, $\dot x(0) = y_0$. The same result can be obtained directly by the methods of Sect. 19.6.


Solution trajectories of the nonlinear pendulum. In the coordinates $(x, y)$, the total energy reads

$$H(x, y) = \frac{1}{2} y^2 + 1 - \cos x.$$

As was shown above, it is a conserved quantity; hence solution curves for prescribed initial values $(x_0, y_0)$ lie on the level sets $H(x, y) = C$; i.e.

$$\frac{1}{2} y^2 + 1 - \cos x = \frac{1}{2} y_0^2 + 1 - \cos x_0,$$

$$y = \pm\sqrt{y_0^2 - 2\cos x_0 + 2\cos x}.$$

Fig. 20.15 Solution curves, mathematical pendulum

Figure 20.15 shows some solution curves. There are unstable equilibria at $y = 0$, $x = \ldots, -3\pi, -\pi, \pi, 3\pi, \ldots$ which are connected by limit curves. One of the two limit curves passes through $x_0 = 0$, $y_0 = 2$. The solution with these initial values lies on the limit curve and approaches the equilibrium $(\pi, 0)$ as $t \to \infty$, and $(-\pi, 0)$ as $t \to -\infty$. Initial values that lie between these limit curves (for instance the values $x_0 = 0$, $|y_0| < 2$) give rise to periodic solutions of small amplitude (less than $\pi$). The solutions outside represent large oscillations where the pendulum loops. We remark that effects of friction are not taken into account in this model.
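The role of the level sets can be illustrated with a small Python sketch. It assumes the normalised energy $H(x, y) = \frac{1}{2} y^2 + 1 - \cos x$ from above and uses the fact that the limit curves lie on the level $H = 2$ (the energy of the equilibrium $(\pi, 0)$). The function names `energy`, `classify` and `branch` are hypothetical helpers introduced only for this illustration.

```python
import math

def energy(x, y):
    """Total energy H(x, y) = y^2/2 + 1 - cos(x) of the normalised pendulum."""
    return 0.5 * y * y + 1.0 - math.cos(x)

def classify(x0, y0, tol=1e-12):
    """Classify the orbit through (x0, y0) by comparing H with the level 2
    of the limit curves (H(pi, 0) = 2)."""
    h = energy(x0, y0)
    if abs(h - 2.0) <= tol:
        return "limit curve"
    return "oscillation" if h < 2.0 else "looping"

def branch(x, x0, y0):
    """Upper branch y = sqrt(y0^2 - 2 cos(x0) + 2 cos(x)) of the level set
    through (x0, y0); returns None where the level set has no point."""
    r = y0 * y0 - 2.0 * math.cos(x0) + 2.0 * math.cos(x)
    return math.sqrt(r) if r >= 0.0 else None

for y0 in (1.0, 2.0, 2.5):
    print(y0, classify(0.0, y0))
```

The three initial values correspond to a periodic oscillation, the limit curve through $(0, 2)$, and a looping solution, respectively.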

Power  series  solutions.  The  method  of  power  series  for  solving  differential  equations 
has  been  introduced  in  Chap.  19.  We  have  seen  that  the  linearised  pendulum  equation 
x  =  —x  can  be  solved  explicitly  by  the  methods  of  Sects.  19.6  and  20.1.  Also,  the 
nonlinear  pendulum  equation  can  be  solved  explicitly  with  the  aid  of  certain  higher 
transcendental  functions,  the  Jacobian  elliptic  functions.  Nevertheless,  it  is  of  interest 
to analyse the solutions of these equations by means of power series, especially in
view  of  the  fact  that  they  can  be  readily  obtained  in  maple. 

Example  20.13  (Power  series  for  the  linearised  pendulum)  As  an  example,  we  solve 
the  initial  value  problem 


$$\ddot x = -x, \quad x(0) = a, \quad \dot x(0) = 0$$


by  means  of  the  power  series  ansatz 


$$x(t) = \sum_{n=0}^{\infty} c_n t^n = c_0 + c_1 t + c_2 t^2 + c_3 t^3 + c_4 t^4 + \cdots$$

We have

$$\dot x(t) = \sum_{n=1}^{\infty} n c_n t^{n-1} = c_1 + 2c_2 t + 3c_3 t^2 + 4c_4 t^3 + \cdots$$

$$\ddot x(t) = \sum_{n=2}^{\infty} n(n-1) c_n t^{n-2} = 2c_2 + 6c_3 t + 12c_4 t^2 + \cdots$$

We know that $c_0 = a$ and $c_1 = 0$. Equating $\ddot x(t)$ with $-x(t)$ gives, up to second degree,

$$2c_2 + 6c_3 t + 12c_4 t^2 + \cdots = -a - c_2 t^2 - \cdots$$


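thus

$$c_2 = -\frac{a}{2}, \quad c_3 = 0, \quad c_4 = -\frac{c_2}{12} = \frac{a}{24}.$$

The power series expansion starts with

$$x(t) = a\left(1 - \frac{1}{2} t^2 + \frac{1}{24} t^4 \mp \cdots\right)$$

and seemingly coincides with the Taylor series of the known solution $x(t) = a \cos t$.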
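The comparison of coefficients can be carried to any order: equating the coefficient of $t^n$ in $\ddot x = -x$ gives $(n+2)(n+1)\,c_{n+2} = -c_n$. The following Python sketch applies this recursion and compares the truncated series with $a \cos t$. The function names are hypothetical and chosen only for this illustration.

```python
import math

def series_coefficients(a, n_terms):
    """Coefficients c_0, ..., c_{n_terms-1} of the power series solution of
    x'' = -x, x(0) = a, x'(0) = 0, from (n+2)(n+1) c_{n+2} = -c_n."""
    c = [0.0] * n_terms
    c[0] = a              # c_0 = x(0)
    # c_1 = x'(0) = 0 is already set by the initialisation
    for n in range(n_terms - 2):
        c[n + 2] = -c[n] / ((n + 2) * (n + 1))
    return c

def evaluate(c, t):
    """Evaluate the truncated power series at time t."""
    return sum(ck * t**k for k, ck in enumerate(c))

c = series_coefficients(1.0, 12)
print(c[2], c[4])                           # compare with -1/2 and 1/24
print(evaluate(c, 0.5) - math.cos(0.5))     # truncation error at t = 0.5
```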


Example  20.14  (Power  series  for  the  nonlinear  pendulum)  We  turn  to  the  initial 
value  problem  for  the  nonlinear  pendulum  equation 

$$\ddot x = -\sin x, \quad x(0) = a, \quad \dot x(0) = 0,$$

making  the  same  power  series  ansatz  as  in  Example  20.13.  Developing  the  sine 
function  into  its  Taylor  series,  inserting  the  lowest  order  terms  of  the  power  series  of 
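$x(t)$ and noting that $c_0 = a$, $c_1 = 0$ yields

$$\begin{aligned}
-\sin x(t) &= -\left(x(t) - \frac{x(t)^3}{3!} + \frac{x(t)^5}{5!} \mp \cdots\right) \\
&= -(a + c_2 t^2 + \cdots) + \frac{1}{6}(a + c_2 t^2 + \cdots)^3 - \frac{1}{120}(a + c_2 t^2 + \cdots)^5 \pm \cdots \\
&= -(a + c_2 t^2 + \cdots) + \frac{1}{6}(a^3 + 3a^2 c_2 t^2 + \cdots) - \frac{1}{120}(a^5 + 5a^4 c_2 t^2 + \cdots) \pm \cdots,
\end{aligned}$$

where we have used the binomial formulas. Equating the last line with

$$\ddot x(t) = 2c_2 + 6c_3 t + 12c_4 t^2 + \cdots$$

shows that

$$\begin{aligned}
2c_2 &= -a + \frac{1}{6} a^3 - \frac{1}{120} a^5 \pm \cdots \\
6c_3 &= 0 \\
12c_4 &= c_2\left(-1 + \frac{3}{6} a^2 - \frac{5}{120} a^4 \pm \cdots\right)
\end{aligned}$$

which suggests that

$$c_2 = -\frac{1}{2} \sin a, \quad c_4 = \frac{1}{24} \sin a \cos a.$$

Collecting terms and factoring $a$ out finally results in the expansion

$$x(t) = a\left(1 - \frac{1}{2}\,\frac{\sin a}{a}\, t^2 + \frac{1}{24}\,\frac{\sin a \cos a}{a}\, t^4 \pm \cdots\right).$$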

The  expansion  can  be  checked  by  means  of  the  maple  command 

ode := diff(x(t), [t$2]) = -sin(x(t));
ics := x(0) = a, D(x)(0) = 0;
dsolve({ode, ics}, x(t), series);

If the initial deflection $x_0 = a$ is sufficiently small, then

$$\frac{\sin a}{a} \approx 1, \quad \cos a \approx 1,$$

see Proposition 6.10, and so the solution $x(t)$ is close to the solution $a \cos t$ of the linearised pendulum equation, as expected.
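As a plausibility check outside maple, the truncated expansion can be compared with a numerical solution. The sketch below is an illustration under assumptions: it integrates the first-order system $\dot x = y$, $\dot y = -\sin x$ with the classical fourth-order Runge-Kutta method (a technique chosen here for convenience, not one used in this chapter), and the function names are hypothetical.

```python
import math

def series_approx(a, t):
    """Truncated expansion a - (sin a / 2) t^2 + (sin a cos a / 24) t^4."""
    s = math.sin(a)
    return a - 0.5 * s * t**2 + s * math.cos(a) * t**4 / 24.0

def rk4_pendulum(a, t_end, steps=1000):
    """Integrate x' = y, y' = -sin x with x(0) = a, y(0) = 0 up to t_end
    using the classical Runge-Kutta method; returns x(t_end)."""
    h = t_end / steps
    x, y = a, 0.0
    for _ in range(steps):
        k1x, k1y = y, -math.sin(x)
        k2x, k2y = y + 0.5 * h * k1y, -math.sin(x + 0.5 * h * k1x)
        k3x, k3y = y + 0.5 * h * k2y, -math.sin(x + 0.5 * h * k2x)
        k4x, k4y = y + h * k3y, -math.sin(x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x

a, t = 1.0, 0.3
print(series_approx(a, t), rk4_pendulum(a, t))
# For a small deflection the expansion is close to a*cos(t):
print(series_approx(0.05, t), 0.05 * math.cos(t))
```

For small $t$ the two values for $a = 1$ agree up to the neglected terms of order $t^6$.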


20.4  Exercises 

1.  The  space-time  diagram  of  a  two-dimensional  system  of  differential  equations 
(Remark  20.2)  can  be  obtained  by  introducing  time  as  third  variable  z(t)  =  t 
and  passing  to  the  three-dimensional  system 


$$\begin{bmatrix} \dot x \\ \dot y \\ \dot z \end{bmatrix} = \begin{bmatrix} f(x, y) \\ g(x, y) \\ 1 \end{bmatrix}.$$

Use  this  observation  to  visualise  the  systems  from  Examples  20.1  and  20.3. 
Study  the  time-dependent  solution  curves  with  the  applet  Dynamical  systems  in 
space. 

2.  Compute  the  general  solutions  of  the  following  three  systems  of  differential 
equations  by  transformation  to  standard  form: 

X  =  x  =  —3 y,  x  =  \x  -  fy, 

y  =  ~tx  ~  b’  y  =  x,  y  =  ^x  +  ).y' 

Visualise  the  solution  curves  with  the  applet  Dynamical  systems  in  the  plane. 

3. Small, undamped oscillations of an object of mass $m$ attached to a spring are described by the differential equation $m\ddot x + kx = 0$. Here, $x = x(t)$ denotes the displacement from the position of rest and $k$ is the spring stiffness. Introduce the variable $y = \dot x$ and rewrite the second-order differential equation as a linear system of differential equations. Find the general solution.

4. A company deposits its profits in an account with continuous interest rate $a\%$. The balance is denoted by $x(t)$. Simultaneously the amount $y(t)$ is withdrawn continuously from the account, where the rate of withdrawal is equal to $b\%$ of the account balance. With $r = a/100$, $s = b/100$ this leads to the linear system of differential equations

$$\begin{aligned} \dot x(t) &= r(x(t) - y(t)), \\ \dot y(t) &= s x(t). \end{aligned}$$

Find the solution $(x(t), y(t))$ for the initial values $x(0) = 1$, $y(0) = 0$ and analyse how big $s$ can be in comparison with $r$ so that the account balance $x(t)$ is increasing for all times without oscillations.

5. A national economy has two sectors (for instance industry and agriculture) with the production volumes $x_1(t)$, $x_2(t)$ at time $t$. If one assumes that the investments are proportional to the respective growth rate, then the classical model of Leontief³ [24, Chap. 9.5] states

$$\begin{aligned} x_1(t) &= a_{11} x_1(t) + a_{12} x_2(t) + b_1 \dot x_1(t) + c_1(t), \\ x_2(t) &= a_{21} x_1(t) + a_{22} x_2(t) + b_2 \dot x_2(t) + c_2(t). \end{aligned}$$

Here $a_{ij}$ denotes the required amount of goods from sector $i$ to produce one unit of goods in sector $j$. Further $b_i \dot x_i(t)$ are the investments, and $c_i(t)$ is the consumption in sector $i$. Under the simplifying assumptions $a_{11} = a_{22} = 0$, $a_{12} = a_{21} = a$ $(0 < a < 1)$, $b_1 = b_2 = 1$, $c_1(t) = c_2(t) = 0$ (no consumption) one obtains the system of differential equations

$$\begin{aligned} \dot x_1(t) &= x_1(t) - a x_2(t), \\ \dot x_2(t) &= -a x_1(t) + x_2(t). \end{aligned}$$

Find the general solution and discuss the result.

6.  Use  the  applet  Dynamical  systems  in  the  plane  to  analyse  the  solution  curves 
of  the  differential  equations  of  the  mathematical  pendulum  and  translate  the 
mathematical  results  to  statements  about  the  mechanical  behaviour. 

7. Derive the conserved quantity $H(x, y) = \frac{1}{2} y^2 + 1 - \cos x$ of the pendulum equation by means of the ansatz $H(x, y) = F(x) + G(y)$ as for the Lotka-Volterra system.

8. Using maple, find the power series solution to the nonlinear pendulum equation $\ddot x = -\sin x$ with initial data

$$x(0) = a, \ \dot x(0) = 0 \quad \text{and} \quad x(0) = 0, \ \dot x(0) = b.$$

Check by how much its coefficients differ from the ones of the power series solution of the corresponding linearised pendulum equation $\ddot x = -x$ for various values of $a$, $b$ between 0 and 1.


³ W. Leontief, 1906-1999.

9. The differential equation $m\ddot x(t) + kx(t) + 2cx^3(t) = 0$ describes a nonlinear mass-spring system where $x(t)$ is the displacement of the mass $m$, $k$ is the stiffness of the spring and the term $cx^3$ models nonlinear effects ($c > 0$ ... hardening, $c < 0$ ... softening).

(a) Show that

$$H(x, \dot x) = \frac{1}{2}\left(m \dot x^2 + k x^2 + c x^4\right)$$

is a conserved quantity.

(b) Assume that $m = 1$, $k = 1$ and $x(0) = 0$, $\dot x(0) = 1$. Reduce the second-order equation to a first-order system. Making use of the conserved quantity, plot the solution curves for the values of $c = 0$, $c = -0.2$, $c = 0.2$ and $c = 5$.
Hint. A typical maple command is

with(plots, implicitplot); c := 5;
implicitplot(y^2 + x^2 + c*x^4 = 1, x = -1.5..1.5, y = -1.5..1.5);

10. Using maple, find the power series solution to the nonlinear differential equation $\ddot x(t) + x(t) + 2cx^3(t) = 0$ with initial data $x(0) = a$, $\dot x(0) = b$. Compare it to the solution with $c = 0$.

Numerical Solution of Differential Equations


As  we  have  seen  in  the  last  two  chapters,  only  particular  classes  of  differential 
equations  can  be  solved  analytically.  Especially  for  nonlinear  problems  one  has  to 
rely  on  numerical  methods. 

In this chapter we discuss several variants of Euler’s method as a prototype. Motivated by the Taylor expansion of the analytical solution we deduce Euler approximations and study their stability properties. In this way we introduce the reader to several important aspects of the numerical solution of differential equations. We point out, however, that for most real-life applications one has to use more sophisticated numerical methods.


21.1 The Explicit Euler Method

The  differential  equation 

$$y'(x) = f(x, y(x))$$

defines the slope of the tangent to the solution curve $y(x)$. Expanding the solution at the point $x + h$ into a Taylor series

$$y(x + h) = y(x) + h y'(x) + O(h^2)$$

and inserting the above value for $y'(x)$, one obtains

$$y(x + h) = y(x) + h f(x, y(x)) + O(h^2)$$

and consequently for small $h$ the approximation

$$y(x + h) \approx y(x) + h f(x, y(x)).$$

This observation motivates the (explicit) Euler method.

Euler’s  method.  For  the  numerical  solution  of  the  initial  value  problem 

$$y'(x) = f(x, y(x)), \quad y(a) = y_0$$

on the interval $[a, b]$ we first divide the interval into $N$ parts of length $h = (b - a)/N$ and define the grid points $x_j = x_0 + jh$, $0 \le j \le N$, see Fig. 21.1.

Fig. 21.1 Equidistant grid points $x_j = x_0 + jh$, with $x_0 = a$ and $x_N = x_0 + Nh = b$


The distance $h$ between two grid points is called step size. We look for a numerical approximation $y_n$ to the exact solution $y(x_n)$ at $x_n$, i.e. $y_n \approx y(x_n)$. According to the considerations above we should have

$$y(x_{n+1}) \approx y(x_n) + h f(x_n, y(x_n)).$$

If one replaces the exact solution by the numerical approximation and $\approx$ by $=$, then one obtains the explicit Euler method

$$y_{n+1} = y_n + h f(x_n, y_n),$$

which defines the approximation $y_{n+1}$ as a function of $y_n$.

Starting from the initial value $y_0$ one computes from this recursion the approximations $y_1, y_2, \ldots, y_N \approx y(b)$. The points $(x_i, y_i)$ are the vertices of a polygon which approximates the graph of the exact solution $y(x)$. Figure 21.2 shows the exact solution of the differential equation $y' = y$, $y(0) = 1$ as well as polygons defined by Euler’s method for three different step sizes.

Euler’s method is convergent of order 1, see [11, Chap. II.3]. On bounded intervals $[a, b]$ one thus has the uniform error estimate

$$|y(x_n) - y_n| \le C h$$

for all $n \ge 1$ and sufficiently small $h$ with $0 \le nh \le b - a$. The constant $C$ depends on the length of the interval and the solution $y(x)$; however, it does not depend on $n$ and $h$.
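The recursion and the first-order error estimate can be tried out with a few lines of Python. This is a minimal sketch, not the program accompanying the book; the function name `explicit_euler` and the choice of test problem $y' = y$, $y(0) = 1$ on $[0, 1]$ are assumptions made for the illustration.

```python
import math

def explicit_euler(f, a, b, y0, n):
    """Explicit Euler method y_{k+1} = y_k + h f(x_k, y_k) on [a, b] with
    n steps of size h = (b - a)/n; returns grid points and approximations."""
    h = (b - a) / n
    xs, ys = [a], [y0]
    for k in range(n):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(a + (k + 1) * h)
    return xs, ys

# Test problem y' = y, y(0) = 1 with exact value y(1) = e.
errors = []
for n in (100, 200, 400):
    xs, ys = explicit_euler(lambda x, y: y, 0.0, 1.0, 1.0, n)
    errors.append(abs(math.e - ys[-1]))

# Halving h should roughly halve the error (convergence of order 1).
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(errors)
print(ratios)
```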
Fig. 21.2 Euler approximation to $y' = y$, $y(0) = 1$


Example 21.1 The solution of the initial value problem $y' = y$, $y(0) = 1$ is $y(x) = e^x$. For $nh = 1$ the numerical solution $y_n$ approximates the exact solution at $x = 1$. Due to

$$y_n = y_{n-1} + h y_{n-1} = (1 + h) y_{n-1} = \cdots = (1 + h)^n y_0$$

we have

$$y_n = (1 + h)^n = \left(1 + \frac{1}{n}\right)^n.$$

The convergence of Euler’s method thus implies

$$e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n.$$

This formula for $e$ was already deduced in Example 7.11.

In commercial software packages, methods of higher order are used for the numerical integration, for example Runge-Kutta or multi-step methods. All these methods are refinements of Euler’s method. In modern implementations of these algorithms the error is automatically estimated and the step size adaptively adjusted to the problem. For more details, we refer to [11, 12].

Experiment 21.2 In MATLAB you can find information on the numerical solution of differential equations by calling help funfun. For example, one can solve the initial value problem

$$y' = y^2, \quad y(0) = 0.9$$

on the interval $[0, 1]$ with the command

[x,y] = ode23('qfun', [0,1], 0.9);

The file qfun.m has to contain the definition of the function

function yp = qfun(x,y)
yp = y.^2;

For a plot of the solution, one sets the option

myopt = odeset('OutputFcn','odeplot')

and calls the solver by

[x,y] = ode23('qfun', [0,1], 0.9, myopt);

Start the program with different initial values and observe the blow up for $y(0) > 1$.

21.2 Stability and Stiff Problems

The  linear  differential  equation 

$$y' = ay, \quad y(0) = 1$$

has the solution

$$y(x) = e^{ax}.$$

For $a < 0$ this solution has the following qualitative property, independent of the size of $a$:

$$|y(x)| \le 1 \quad \text{for all } x \ge 0.$$

We are investigating whether numerical methods preserve this property. For that we solve the differential equation with the explicit Euler method and obtain

$$y_n = y_{n-1} + h a y_{n-1} = (1 + ha) y_{n-1} = \cdots = (1 + ha)^n y_0 = (1 + ha)^n.$$

For $-2 \le ha \le 0$ the numerical solution obeys the same bound

$$|y_n| = |(1 + ha)^n| = |1 + ha|^n \le 1$$

as the exact solution. However, for $ha < -2$ a dramatic instability occurs although the exact solution is harmless. In fact, all explicit methods have the same difficulties in this situation: The solution is only stable under very restrictive conditions on the step size. For the explicit Euler method the condition for stability is

$$-2 \le ha \le 0.$$

For $a \ll 0$ this implies a drastic restriction on the step size, which eventually makes the method in this situation inefficient.
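The stability condition is easy to observe numerically. The following Python sketch iterates the explicit Euler recursion for $y' = ay$, $y(0) = 1$; the function name and the sample values $a = -50$, $h = 0.03$ and $h = 0.05$ are assumptions chosen for the illustration.

```python
def explicit_euler_linear(a, h, n):
    """Return |y_n| after n explicit Euler steps for y' = a*y, y(0) = 1."""
    y = 1.0
    for _ in range(n):
        y = y + h * a * y      # y_{k+1} = (1 + h a) y_k
    return abs(y)

a = -50.0
print(explicit_euler_linear(a, 0.03, 100))   # h*a = -1.5: bounded by 1
print(explicit_euler_linear(a, 0.05, 100))   # h*a = -2.5: grows like 1.5**n
```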

In  this  case  a  remedy  is  offered  by  implicit  methods,  for  example,  the  implicit 
Euler  method 


$$y_{n+1} = y_n + h f(x_{n+1}, y_{n+1}).$$

It  differs  from  the  explicit  method  by  the  fact  that  the  slope  of  the  tangent  is  now 
taken  at  the  endpoint.  For  the  determination  of  the  numerical  solution,  a  nonlinear 
equation  has  to  be  solved  in  general.  Therefore,  such  methods  are  called  implicit. 
The  implicit  Euler  method  has  the  same  accuracy  as  the  explicit  one,  but  by  far  better 
stability  properties,  as  the  following  analysis  shows.  If  one  applies  the  implicit  Euler 
method  to  the  initial  value  problem 

$$y' = ay, \quad y(0) = 1, \quad \text{with } a < 0,$$

one obtains

$$y_n = y_{n-1} + h f(x_n, y_n) = y_{n-1} + h a y_n,$$

and therefore

$$y_n = \frac{1}{1 - ha}\, y_{n-1} = \cdots = \frac{1}{(1 - ha)^n}\, y_0 = \frac{1}{(1 - ha)^n}.$$

The procedure is thus stable, i.e. $|y_n| \le 1$, if

$$|(1 - ha)^n| \ge 1.$$

For $a < 0$, however, this is fulfilled for all $h \ge 0$. Thus the procedure is stable for arbitrarily large step sizes.
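For the linear test equation the implicit step can be solved in closed form, so the unconditional stability can be checked directly. A minimal sketch, with a hypothetical function name and sample step sizes chosen for the illustration:

```python
def implicit_euler_linear(a, h, n):
    """Return |y_n| after n implicit Euler steps for y' = a*y, y(0) = 1.
    The implicit equation y_k = y_{k-1} + h a y_k is solved for y_k."""
    y = 1.0
    for _ in range(n):
        y = y / (1.0 - h * a)
    return abs(y)

a = -50.0
for h in (0.05, 1.0, 100.0):
    print(h, implicit_euler_linear(a, h, 20))   # stays below 1 for every h > 0
```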


Remark 21.3 A differential equation is called stiff, if for its solution the implicit Euler method is more efficient (often dramatically more efficient) than the explicit method.
Fig. 21.3 Instability of the explicit Euler method. In each case the picture shows the exact solution and the approximating polygons of Euler’s method with n steps (panel titles n = 250, n = 248; horizontal axis from 0 to 10)

Fig. 21.4 Stability of the implicit Euler method. In each case the picture shows the exact solution and the approximating polygons of Euler’s method with n steps


Example 21.4 (From [12, Chap. IV.1]) We integrate the initial value problem

$$y' = -50(y - \cos x), \quad y(0) = 0.997.$$

Its exact solution is

$$y(x) = \frac{2500}{2501} \cos x + \frac{50}{2501} \sin x - \frac{6503}{2501000}\, e^{-50x} \approx \cos(x - 0.02) - 0.0026\, e^{-50x}.$$


The solution looks quite harmless and resembles $\cos x$, but the equation is stiff with $a = -50$. Warned by the analysis above we expect difficulties for explicit methods.

We integrate this differential equation numerically on the interval $[0, 10]$ with the explicit Euler method and step sizes $h = 10/n$ with $n = 250$, $248$ and $246$. For $n < 250$, i.e. $h > 1/25$, exponential instabilities occur, see Fig. 21.3. This is consistent with the considerations above because the product $ah$ satisfies $ah < -2$ for $h > 1/25$.

However, if one integrates the differential equation with the implicit Euler method, then even for very large step sizes no instabilities arise, see Fig. 21.4. The implicit Euler method is more costly than the explicit one, as the computation of $y_{n+1}$ from

$$y_{n+1} = y_n + h f(x_{n+1}, y_{n+1})$$

generally requires the solution of a nonlinear equation.
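The behaviour described in this example can be reproduced with a short Python sketch. It rests on a few assumptions: the initial value is taken as $y(0) = 0.997$, which matches the approximation $\cos(x - 0.02) - 0.0026\,e^{-50x}$; the exact solution is written with a free constant determined by $y_0$; and because the equation is linear in $y$, the implicit step is solved in closed form rather than by a nonlinear solver. All function names are hypothetical.

```python
import math

def exact(x, y0=0.997):
    """Exact solution of y' = -50 (y - cos x) with y(0) = y0."""
    c = y0 - 2500.0 / 2501.0
    return (2500.0 * math.cos(x) + 50.0 * math.sin(x)) / 2501.0 + c * math.exp(-50.0 * x)

def max_error_explicit(n, b=10.0, y0=0.997):
    """Largest error of the explicit Euler method on [0, b] with n steps."""
    h = b / n
    y, err = y0, 0.0
    for k in range(n):
        y = y + h * (-50.0 * (y - math.cos(k * h)))
        err = max(err, abs(y - exact((k + 1) * h, y0)))
    return err

def max_error_implicit(n, b=10.0, y0=0.997):
    """Largest error of the implicit Euler method on [0, b] with n steps.
    The step y_new = y - 50 h (y_new - cos x_new) is solved for y_new."""
    h = b / n
    y, err = y0, 0.0
    for k in range(n):
        x_new = (k + 1) * h
        y = (y + 50.0 * h * math.cos(x_new)) / (1.0 + 50.0 * h)
        err = max(err, abs(y - exact(x_new, y0)))
    return err

print(max_error_explicit(250), max_error_explicit(246))
print(max_error_implicit(25))
```

With $n = 250$ the explicit method sits exactly on the stability boundary $ah = -2$ and the error stays small, while $n = 246$ already produces a large error; the implicit method remains accurate with only 25 steps.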
For the derivation of a simple numerical method for solving systems of differential equations

$$\begin{aligned} \dot x(t) &= f(t, x(t), y(t)), & x(t_0) &= x_0, \\ \dot y(t) &= g(t, x(t), y(t)), & y(t_0) &= y_0, \end{aligned}$$

one again starts from the Taylor expansion of the analytic solution

$$\begin{aligned} x(t + h) &= x(t) + h \dot x(t) + O(h^2), \\ y(t + h) &= y(t) + h \dot y(t) + O(h^2) \end{aligned}$$

and replaces the derivatives by the right-hand sides of the differential equations. For small step sizes $h$ this motivates the explicit Euler method

$$\begin{aligned} x_{n+1} &= x_n + h f(t_n, x_n, y_n), \\ y_{n+1} &= y_n + h g(t_n, x_n, y_n). \end{aligned}$$

One interprets $x_n$ and $y_n$ as numerical approximations to the exact solution $x(t_n)$ and $y(t_n)$ at time $t_n = t_0 + nh$.

Example 21.5 In Sect. 20.2 we have investigated the Lotka-Volterra model

$$\begin{aligned} \dot x &= x(y - 1), \\ \dot y &= y(1 - x). \end{aligned}$$

In order to compute the periodic orbit through the point $(x_0, y_0) = (2, 2)$ numerically, we apply the explicit Euler method and obtain the recursion

$$\begin{aligned} x_{n+1} &= x_n + h x_n (y_n - 1), \\ y_{n+1} &= y_n + h y_n (1 - x_n). \end{aligned}$$

Starting from the initial values $x_0 = 2$ and $y_0 = 2$ this recursion determines the numerical solution for $n \ge 0$. The results for three different step sizes are depicted in Fig. 21.5. Note the linear convergence of the numerical solution for $h \to 0$.

This numerical experiment shows that one has to choose a very small step size in order to obtain the periodicity of the true orbit in the numerical solution. Alternatively, one can use numerical methods of higher order or, in the present example, also the following modification of Euler’s method

$$\begin{aligned} x_{n+1} &= x_n + h x_n (y_n - 1), \\ y_{n+1} &= y_n + h y_n (1 - x_{n+1}). \end{aligned}$$
Fig. 21.5 Numerical computation of a periodic orbit of the Lotka-Volterra model. The system was integrated on the interval $0 \le t \le 14$ with Euler’s method and constant step sizes $h = 14/n$ for $n = 250$, $500$ and $1000$

Fig. 21.6 Numerical computation of a periodic orbit of the Lotka-Volterra model. The system was integrated on the interval $0 \le t \le 14$ with the modified Euler method with constant step sizes $h = 14/n$ for $n = 50$, $100$ and $200$

In this method one uses instead of $x_n$ the updated value $x_{n+1}$ for the computation of $y_{n+1}$. The numerical results, obtained with this modified Euler method, are given in Fig. 21.6. One clearly recognises the superiority of this approach compared to the original one: the geometric structure of the solution is captured much better.
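The difference between the two recursions can be quantified with a short Python sketch. As a measure of quality it monitors the quantity $x - \ln x + y - \ln y$, which is constant along exact solutions of this system (differentiate along $\dot x = x(y-1)$, $\dot y = y(1-x)$ to verify); using it as a diagnostic is a choice made for this illustration, and the function names are hypothetical.

```python
import math

def invariant(x, y):
    """Quantity x - ln x + y - ln y, constant along exact solutions of
    x' = x (y - 1), y' = y (1 - x)."""
    return x - math.log(x) + y - math.log(y)

def max_drift(n, modified, t_end=14.0, x0=2.0, y0=2.0):
    """Largest deviation of the invariant from its initial value when
    integrating with n steps of size h = t_end/n.
    modified=False: explicit Euler; modified=True: y-update uses x_{n+1}."""
    h = t_end / n
    x, y = x0, y0
    ref = invariant(x, y)
    drift = 0.0
    for _ in range(n):
        x_new = x + h * x * (y - 1.0)
        x_used = x_new if modified else x
        y = y + h * y * (1.0 - x_used)
        x = x_new
        drift = max(drift, abs(invariant(x, y) - ref))
    return drift

print("explicit Euler:", max_drift(250, modified=False))
print("modified Euler:", max_drift(250, modified=True))
```

With the same step size the modified method keeps the invariant much closer to its initial value, in line with the closed orbits visible in Fig. 21.6.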


21.4  Exercises 

1. Solve the special Riccati equation $y' = x^2 + y^2$, $y(0) = -4$ for $0 \le x \le 2$ with MATLAB.

2. Solve with MATLAB the linear system of differential equations

$$\dot x = y, \quad \dot y = -x$$

with initial values $x(0) = 1$ and $y(0) = 0$ on the interval $[0, b]$ for $b = 2\pi$, $10\pi$ and $200\pi$. Explain the observations.

Hint. In MATLAB one can use the command ode23('mat21_1', [0 2*pi], [0 1]), where the file mat21_1.m defines the right-hand side of the differential equation.

3. Solve the Lotka-Volterra system

$$\dot x = x(y - 1), \quad \dot y = y(1 - x)$$

for $0 \le t \le 14$ with initial values $x(0) = 2$ and $y(0) = 2$ in MATLAB. Compare your results with Figs. 21.5 and 21.6.

4. Let $y'(x) = f(x, y(x))$. Show by Taylor expansion that

$$y(x + h) = y(x) + h f\left(x + \frac{h}{2},\ y(x) + \frac{h}{2} f(x, y(x))\right) + O(h^3)$$

and deduce from this the numerical scheme

$$y_{n+1} = y_n + h f\left(x_n + \frac{h}{2},\ y_n + \frac{h}{2} f(x_n, y_n)\right).$$

Compare the accuracy of this scheme with that of the explicit Euler method applied to the Riccati equation of Exercise 1.

5. Apply the numerical scheme

$$y_{n+1} = y_n + h f\left(x_n + \frac{h}{2},\ y_n + \frac{h}{2} f(x_n, y_n)\right)$$

to the solution of the differential equation

$$y' = y, \quad y(0) = 1$$

and show that

$$y_n = \left(1 + h + \frac{h^2}{2}\right)^n.$$

Deduce from this identity a formula for approximating $e$. How do the results compare to the corresponding formula obtained with the explicit Euler scheme?
Hint: Choose $h = 1/n$ for $n = 10, 100, 1000, 10000$.

6. Let $a < 0$. Apply the numerical scheme

$$y_{n+1} = y_n + h f\left(x_n + \frac{h}{2},\ y_n + \frac{h}{2} f(x_n, y_n)\right)$$

to the linear differential equation $y' = ay$, $y(0) = 1$ and find a condition on the step size $h$ such that $|y_n| \le 1$ for all $n \in \mathbb{N}$.


Vector  Algebra 


A 


In  various  sections  of  this  book  we  referred  to  the  notion  of  a  vector.  We  assumed 
the  reader  to  have  a  basic  knowledge  on  standard  school  level.  In  this  appendix  we 
recapitulate  some  basic  notions  of  vector  algebra.  For  a  more  detailed  presentation 
we  refer  to  [2]. 


A.1  Cartesian  Coordinate  Systems 

A Cartesian coordinate system in the plane (in space) consists of two (three) real lines (coordinate axes) which intersect in right angles at the point $O$ (origin). We always assume that the coordinate system is positively (right-handed) oriented. In a planar right-handed system, the positive $y$-axis lies to the left in viewing direction of the positive $x$-axis (Fig. A.1). In a positively oriented three-dimensional coordinate system, the direction of the positive $z$-axis is obtained by turning the $x$-axis in the direction of the $y$-axis according to the right-hand rule, see Fig. A.2.

The coordinates of a point are obtained by parallel projection of the point onto the coordinate axes. In the case of the plane, the point $A$ has the coordinates $a_1$ and $a_2$, and we write

$$A = (a_1, a_2) \in \mathbb{R}^2.$$

In an analogous way a point $A$ in space with coordinates $a_1$, $a_2$ and $a_3$ is denoted as

$$A = (a_1, a_2, a_3) \in \mathbb{R}^3.$$

Thus  one  has  a  unique  representation  of  points  by  pairs  or  triples  of  real  numbers. 
Fig. A.1 Cartesian coordinate system in the plane

Fig. A.2 Cartesian coordinate system in space

A.2  Vectors 

For  two  points  P  and  Q  in  the  plane  (in  space)  there  exists  exactly  one  parallel 
translation  which  moves  P  to  Q.  This  translation  is  called  a  vector.  Vectors  are  thus 
quantities  with  direction  and  length.  The  direction  is  that  from  P  to  Q  and  the  length 
is  the  distance  between  the  two  points.  Vectors  are  used  to  model,  e.g.,  forces  and 
velocities.  We  always  write  vectors  in  boldface. 

For  a  vector  a,  the  vector  —a  denotes  the  parallel  translation  which  undoes  the 
action  of  a;  the  zero  vector  0  does  not  cause  any  translation.  The  composition  of 
two  parallel  translations  is  again  a  parallel  translation.  The  corresponding  operation 
for  vectors  is  called  addition  and  is  performed  according  to  the  parallelogram  rule. 
For a real number $\lambda \ge 0$, the vector $\lambda \mathbf{a}$ is the vector which has the same direction as $\mathbf{a}$, but $\lambda$ times the length of $\mathbf{a}$. This operation is called scalar multiplication. For addition and scalar multiplication the usual rules of computation apply.

Let $\mathbf{a}$ be the parallel translation from $P$ to $Q$. The length of the vector $\mathbf{a}$, i.e. the distance between $P$ and $Q$, is called norm (or magnitude) of the vector. We denote it by $\|\mathbf{a}\|$. A vector $\mathbf{e}$ with $\|\mathbf{e}\| = 1$ is called a unit vector.


A.3  Vectors  in  a  Cartesian  Coordinate  System 

In a Cartesian coordinate system with origin $O$, we denote the three unit vectors in direction of the three coordinate axes by $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$, see Fig. A.3. These three vectors are called the standard basis of $\mathbb{R}^3$. Here $\mathbf{e}_1$ stands for the parallel translation which moves $O$ to $(1, 0, 0)$, etc.

Fig. A.3 Representation of a in components


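The vector $\mathbf{a}$ which moves $O$ to $A$ can be decomposed in a unique way as $\mathbf{a} = a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + a_3 \mathbf{e}_3$. We denote it by

$$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix},$$

where the column on the right-hand side is the so-called coordinate vector of $\mathbf{a}$ with respect to the standard basis $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$. The vector $\mathbf{a}$ is also called position vector of the point $A$. Since we are always working with the standard basis, we identify a vector with its coordinate vector, i.e.

$$\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

and

$$\mathbf{a} = a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + a_3 \mathbf{e}_3 = \begin{bmatrix} a_1 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ a_2 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ a_3 \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}.$$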

To  distinguish  between  points  and  vectors  we  write  the  coordinates  of  points  in  a 
row,  but  use  column  notation  for  vectors. 

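For column vectors the usual rules of computation apply:

$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \\ a_3 + b_3 \end{bmatrix}, \quad \lambda \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} \lambda a_1 \\ \lambda a_2 \\ \lambda a_3 \end{bmatrix}.$$

Thus the addition and the scalar multiplication are defined componentwise.

The norm of a vector $\mathbf{a} \in \mathbb{R}^2$ with components $a_1$ and $a_2$ is computed with Pythagoras’ theorem as $\|\mathbf{a}\| = \sqrt{a_1^2 + a_2^2}$. Hence the components of the vector $\mathbf{a}$ have the representation

$$a_1 = \|\mathbf{a}\| \cdot \cos\alpha \quad \text{and} \quad a_2 = \|\mathbf{a}\| \cdot \sin\alpha,$$

and we obtain

$$\mathbf{a} = \|\mathbf{a}\| \cdot \begin{bmatrix} \cos\alpha \\ \sin\alpha \end{bmatrix} = \text{length} \cdot \text{direction},$$

see Fig. A.4. For the norm of a vector $\mathbf{a} \in \mathbb{R}^3$ the analogous formula $\|\mathbf{a}\| = \sqrt{a_1^2 + a_2^2 + a_3^2}$ holds.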


Remark A.1 The plane $\mathbb{R}^2$ (and likewise the space $\mathbb{R}^3$) appears in two roles: On the one hand as point space (its objects are points which cannot be added) and on the other hand as vector space (its objects are vectors that can be added). By parallel translation, $\mathbb{R}^2$ (as vector space) can be attached to every point of $\mathbb{R}^2$ (as point space), see Fig. A.5. In general, however, point space and vector space are different sets, as shown in the following example.


Example A.2 (Particle on a circle) Let $P$ be the position of a particle which moves on a circle and $\mathbf{v}$ its velocity vector. Then the point space is the circle and the vector space the tangent to the circle at the point $P$, see Fig. A.6.

Fig. A.4 A vector a with its components $a_1$ and $a_2$

Fig. A.5 Force F applied at P

Fig. A.6 Velocity vector is tangential to the circle


A.4  The  Inner  Product  (Dot  Product) 

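The angle $\angle(\mathbf{a}, \mathbf{b})$ between two vectors $\mathbf{a}$, $\mathbf{b}$ is uniquely determined by the condition $0 \le \angle(\mathbf{a}, \mathbf{b}) \le \pi$. One calls a vector $\mathbf{a}$ orthogonal (perpendicular) to $\mathbf{b}$ (in symbols: $\mathbf{a} \perp \mathbf{b}$), if $\angle(\mathbf{a}, \mathbf{b}) = \frac{\pi}{2}$. By definition, the zero vector $\mathbf{0}$ is orthogonal to all vectors.

Definition A.3 Let $\mathbf{a}$, $\mathbf{b}$ be planar (or spatial) vectors. The number

$$\langle \mathbf{a}, \mathbf{b} \rangle = \begin{cases} \|\mathbf{a}\| \cdot \|\mathbf{b}\| \cdot \cos\angle(\mathbf{a}, \mathbf{b}) & \mathbf{a} \ne \mathbf{0},\ \mathbf{b} \ne \mathbf{0}, \\ 0 & \text{otherwise} \end{cases}$$

is called the inner product (dot product) of $\mathbf{a}$ and $\mathbf{b}$.

For planar vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^2$ the inner product is calculated from their components as

$$\langle \mathbf{a}, \mathbf{b} \rangle = \left\langle \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \right\rangle = a_1 b_1 + a_2 b_2.$$

For vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^3$ the analogous formula holds:

$$\langle \mathbf{a}, \mathbf{b} \rangle = \left\langle \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \right\rangle = a_1 b_1 + a_2 b_2 + a_3 b_3.$$

Example A.4 The standard basis vectors $\mathbf{e}_i$ have length 1 and are mutually orthogonal, i.e.

$$\langle \mathbf{e}_i, \mathbf{e}_j \rangle = \begin{cases} 1, & i = j, \\ 0, & i \ne j. \end{cases}$$

For vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ and a scalar $\lambda \in \mathbb{R}$ the inner product obeys the rules

(a) $\langle \mathbf{a}, \mathbf{b} \rangle = \langle \mathbf{b}, \mathbf{a} \rangle$,

(b) $\langle \mathbf{a}, \mathbf{a} \rangle = \|\mathbf{a}\|^2$,

(c) $\langle \mathbf{a}, \mathbf{b} \rangle = 0 \ \Leftrightarrow\ \mathbf{a} \perp \mathbf{b}$,

(d) $\langle \lambda\mathbf{a}, \mathbf{b} \rangle = \langle \mathbf{a}, \lambda\mathbf{b} \rangle = \lambda \langle \mathbf{a}, \mathbf{b} \rangle$,

(e) $\langle \mathbf{a} + \mathbf{b}, \mathbf{c} \rangle = \langle \mathbf{a}, \mathbf{c} \rangle + \langle \mathbf{b}, \mathbf{c} \rangle$.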

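Example A.5 For the vectors

$$\mathbf{a} = \begin{bmatrix} 2 \\ -4 \\ 0 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 6 \\ 3 \\ 4 \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$$

we have

$$\|\mathbf{a}\|^2 = 4 + 16 = 20, \quad \|\mathbf{b}\|^2 = 36 + 9 + 16 = 61, \quad \|\mathbf{c}\|^2 = 1 + 1 = 2,$$

and

$$\langle \mathbf{a}, \mathbf{b} \rangle = 12 - 12 = 0, \quad \langle \mathbf{a}, \mathbf{c} \rangle = 2.$$

From this we conclude that a is perpendicular to b and

$$\cos\angle(\mathbf{a}, \mathbf{c}) = \frac{\langle \mathbf{a}, \mathbf{c} \rangle}{\|\mathbf{a}\| \cdot \|\mathbf{c}\|} = \frac{2}{\sqrt{20}\,\sqrt{2}} = \frac{1}{\sqrt{10}}.$$

The value of the angle between a and c is thus

$$\angle(\mathbf{a}, \mathbf{c}) = \arccos\frac{1}{\sqrt{10}} = 1.249 \text{ rad}.$$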
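The computation in Example A.5 can be retraced numerically. The following is a minimal sketch in plain Python; the helper names `inner`, `norm` and `angle` are chosen here for illustration and are not taken from the book's program collection.

```python
import math

def inner(a, b):
    """Inner product <a, b> computed from the components."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length of a, using rule (b): <a, a> = ||a||^2."""
    return math.sqrt(inner(a, a))

def angle(a, b):
    """Angle between two nonzero vectors, in radians."""
    return math.acos(inner(a, b) / (norm(a) * norm(b)))

a = [2, -4, 0]
b = [6, 3, 4]
c = [1, 0, -1]

print(inner(a, a), inner(b, b), inner(c, c))  # 20 61 2
print(inner(a, b), inner(a, c))               # 0 2
print(round(angle(a, c), 3))                  # 1.249
```

Since the inner product of a and b vanishes, `angle(a, b)` returns a right angle, in agreement with rule (c).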


A.5  The  Outer  Product  (Cross  Product) 


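For vectors a, b in $\mathbb{R}^2$ one defines

$$\mathbf{a} \times \mathbf{b} = a_1 b_2 - a_2 b_1 \in \mathbb{R},$$

the cross product of a and b. An elementary calculation shows that

$$|\mathbf{a} \times \mathbf{b}| = \|\mathbf{a}\| \cdot \|\mathbf{b}\| \cdot \sin\angle(\mathbf{a}, \mathbf{b}).$$

Thus $|\mathbf{a} \times \mathbf{b}|$ is the area of the parallelogram spanned by a and b.

For vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^3$ one defines the cross product as

$$\mathbf{a} \times \mathbf{b} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \times \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix}.$$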

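This product has the following geometric interpretation: If $\mathbf{a} = \mathbf{0}$ or $\mathbf{b} = \mathbf{0}$ or $\mathbf{a} = \lambda\mathbf{b}$, then $\mathbf{a} \times \mathbf{b} = \mathbf{0}$. Otherwise $\mathbf{a} \times \mathbf{b}$ is the vector

(a) which is perpendicular to a and b: $\langle \mathbf{a} \times \mathbf{b}, \mathbf{a} \rangle = \langle \mathbf{a} \times \mathbf{b}, \mathbf{b} \rangle = 0$;

(b) which is directed such that a, b, $\mathbf{a} \times \mathbf{b}$ form a right-handed system;

(c) whose length is equal to the area F of the parallelogram spanned by a and b: $F = \|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\| \cdot \|\mathbf{b}\| \cdot \sin\angle(\mathbf{a}, \mathbf{b})$.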


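Example A.6 Let E be the plane spanned by the two vectors

$$\mathbf{a} = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \quad\text{and}\quad \mathbf{b} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}.$$

Then

$$\mathbf{a} \times \mathbf{b} = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}$$

is a vector perpendicular to this plane.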

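For $\mathbf{a}, \mathbf{b}, \mathbf{c} \in \mathbb{R}^3$ and $\lambda \in \mathbb{R}$ the following rules apply:

(a) $\mathbf{a} \times \mathbf{a} = \mathbf{0}$, $\mathbf{a} \times \mathbf{b} = -(\mathbf{b} \times \mathbf{a})$,

(b) $\lambda(\mathbf{a} \times \mathbf{b}) = (\lambda\mathbf{a}) \times \mathbf{b} = \mathbf{a} \times (\lambda\mathbf{b})$,

(c) $(\mathbf{a} + \mathbf{b}) \times \mathbf{c} = \mathbf{a} \times \mathbf{c} + \mathbf{b} \times \mathbf{c}$.

However, the cross product is not associative and

$$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) \ne (\mathbf{a} \times \mathbf{b}) \times \mathbf{c}$$

for general a, b, c. For instance, the standard basis vectors of $\mathbb{R}^3$ satisfy the following identities:

$$\mathbf{e}_1 \times (\mathbf{e}_1 \times \mathbf{e}_2) = \mathbf{e}_1 \times \mathbf{e}_3 = -\mathbf{e}_2,$$

$$(\mathbf{e}_1 \times \mathbf{e}_1) \times \mathbf{e}_2 = \mathbf{0} \times \mathbf{e}_2 = \mathbf{0}.$$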
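The component formula for the cross product in $\mathbb{R}^3$, Example A.6 and the failure of associativity can be checked with a few lines of Python. This is a sketch only; the function names `cross` and `inner` are illustrative choices.

```python
def inner(a, b):
    """Inner product computed from the components."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two vectors in R^3 (component formula)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

# Example A.6: a normal vector to the plane spanned by a and b
a = [1, -1, 2]
b = [1, 0, 1]
n = cross(a, b)
print(n)                          # [-1, 1, 1]
print(inner(n, a), inner(n, b))   # 0 0

# The cross product is not associative:
e1, e2 = [1, 0, 0], [0, 1, 0]
print(cross(e1, cross(e1, e2)))   # [0, -1, 0], i.e. -e2
print(cross(cross(e1, e1), e2))   # [0, 0, 0]
```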


A.6  Straight  Lines  in  the  Plane 

The general equation of a straight line in the (x, y)-plane is

$$ax + by = c,$$

where at least one of the coefficients a and b must be different from zero. The straight line consists of all points (x, y) which satisfy the above equation,

$$g = \{(x, y) \in \mathbb{R}^2;\ ax + by = c\}.$$


If $b = 0$ (and thus $a \ne 0$) we get

$$x = \frac{c}{a}$$

and thus a line parallel to the y-axis. If $b \ne 0$, one can solve for y and obtains the standard form of a straight line

$$y = -\frac{a}{b}\,x + \frac{c}{b} = kx + d$$

with slope k and intercept d.

The  parametric  representation  of  the  straight  line  is  obtained  from  the  general 
solution  of  the  linear  equation 

ax  +  by  =  c. 

Since  this  equation  is  underdetermined,  one  replaces  the  independent  variable  by  a 
parameter  and  solves  for  the  other  variable. 


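Example A.7 In the equation

$$y = kx + d$$

x is considered as independent variable. One sets $x = \lambda$ and obtains $y = k\lambda + d$ and thus the parametric representation

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ d \end{bmatrix} + \lambda \begin{bmatrix} 1 \\ k \end{bmatrix}, \quad \lambda \in \mathbb{R}.$$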


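Example A.8 In the equation

$$x = 4$$

y is the independent variable (it does not even appear). This straight line in parametric representation is

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \end{bmatrix} + \lambda \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad \lambda \in \mathbb{R}.$$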

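In general, the parametric representation of a straight line is of the form

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} p \\ q \end{bmatrix} + \lambda \begin{bmatrix} u \\ v \end{bmatrix}, \quad \lambda \in \mathbb{R}$$

(position vector of a point plus a multiple of a direction vector). A vector perpendicular to this straight line is called a normal vector. It is a multiple of

$$\begin{bmatrix} v \\ -u \end{bmatrix}, \quad\text{since}\quad \left\langle \begin{bmatrix} u \\ v \end{bmatrix}, \begin{bmatrix} v \\ -u \end{bmatrix} \right\rangle = 0.$$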

The conversion to the nonparametric form is obtained by multiplying the equation in parametric form by a normal vector. Thereby the parameter is eliminated. In the example above one obtains

$$vx - uy = pv - qu.$$

In particular, the coefficients of x and y in the nonparametric form are just the components of a normal vector of the straight line.
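The conversion from the parametric to the nonparametric form can be sketched as a small Python function. The name `line_from_parametric` and the sample values k = 2, d = 3 are assumptions made for this illustration.

```python
def line_from_parametric(p, q, u, v):
    """Return (a, b, c) with a*x + b*y = c for the straight line
    [x, y] = [p, q] + lam * [u, v], using the normal vector [v, -u]."""
    if u == 0 and v == 0:
        raise ValueError("the direction vector must be nonzero")
    return v, -u, p * v - q * u

# Line y = k*x + d from Example A.7 with the sample values k = 2, d = 3:
k, d = 2, 3
a, b, c = line_from_parametric(0, d, 1, k)
print(a, b, c)   # 2 -1 -3, i.e. 2x - y = -3, which is y = 2x + 3

# Every point of the parametric form satisfies the nonparametric equation.
for lam in (-2, 0, 1, 5):
    x, y = 0 + lam * 1, d + lam * k
    assert a * x + b * y == c
```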


A.7  Planes  in  Space 

The general form of a plane in $\mathbb{R}^3$ is

$$ax + by + cz = d,$$

where at least one of the coefficients a, b, c is different from zero. The plane consists of all points which satisfy the above equation, i.e.

$$E = \{(x, y, z) \in \mathbb{R}^3;\ ax + by + cz = d\}.$$

Since  at  least  one  of  the  coefficients  is  nonzero,  one  can  solve  the  equation  for  the 
corresponding  unknown. 

For example, if $c \ne 0$ one can solve for z to obtain

$$z = -\frac{a}{c}\,x - \frac{b}{c}\,y + \frac{d}{c} = kx + ly + e.$$

Here k represents the slope in x-direction, l is the slope in y-direction and e is the intercept on the z-axis (because $z = e$ for $x = y = 0$). By introducing parameters for the independent variables x and y

$$x = \lambda, \quad y = \mu, \quad z = k\lambda + l\mu + e$$

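one thus obtains the parametric representation of the plane:

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ e \end{bmatrix} + \lambda \begin{bmatrix} 1 \\ 0 \\ k \end{bmatrix} + \mu \begin{bmatrix} 0 \\ 1 \\ l \end{bmatrix}, \quad \lambda, \mu \in \mathbb{R}.$$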



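In general, the parametric representation of a plane in $\mathbb{R}^3$ is

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} p \\ q \\ r \end{bmatrix} + \lambda \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} + \mu \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}$$

with $\mathbf{v} \times \mathbf{w} \ne \mathbf{0}$. If one multiplies this equation with $\mathbf{v} \times \mathbf{w}$ and uses

$$\langle \mathbf{v}, \mathbf{v} \times \mathbf{w} \rangle = \langle \mathbf{w}, \mathbf{v} \times \mathbf{w} \rangle = 0,$$

one again obtains the nonparametric form

$$\left\langle \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \mathbf{v} \times \mathbf{w} \right\rangle = \left\langle \begin{bmatrix} p \\ q \\ r \end{bmatrix}, \mathbf{v} \times \mathbf{w} \right\rangle.$$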

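Example A.9 We compute the nonparametric form of the plane

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix} + \lambda \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} + \mu \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}.$$

A normal vector to this plane is given by

$$\mathbf{v} \times \mathbf{w} = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}$$

and thus the equation of the plane is

$$-x + y + z = -1.$$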
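The steps of Example A.9 (normal vector via the cross product, right-hand side via the inner product with the position vector) can be sketched in Python as follows. The function name `plane_from_parametric` is an illustrative choice.

```python
def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1]

def plane_from_parametric(point, v, w):
    """Return (n, d) such that <n, [x, y, z]> = d describes the plane
    point + lam * v + mu * w."""
    n = cross(v, w)
    if n == [0, 0, 0]:
        raise ValueError("v and w must not be multiples of each other")
    return n, inner(n, point)

n, d = plane_from_parametric([3, 1, 1], [1, -1, 2], [1, 0, 1])
print(n, d)   # [-1, 1, 1] -1, i.e. -x + y + z = -1
```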


A.8  Straight  Lines  in  Space 

A straight line in $\mathbb{R}^3$ can be seen as the intersection of two planes:

$$g: \begin{cases} ax + by + cz = d, \\ ex + fy + gz = h. \end{cases}$$

The  straight  line  is  the  set  of  all  points  (x,y,z)  which  fulfil  this  system  of  equations 
(two  equations  in  three  unknowns).  Generically,  the  solution  of  the  above  system 
can  be  parametrised  by  one  parameter  (this  is  the  case  of  a  straight  line).  However, 
it  may  also  happen  that  the  planes  are  parallel.  In  this  situation  they  either  coincide, 
or  they  do  not  intersect  at  all. 
A straight line can also be represented parametrically by the position vector of a point and an arbitrary multiple of a direction vector

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} p \\ q \\ r \end{bmatrix} + \lambda \begin{bmatrix} u \\ v \\ w \end{bmatrix}, \quad \lambda \in \mathbb{R}.$$

The  direction  vector  is  obtained  as  difference  of  the  position  vectors  of  two  points 
on  the  straight  line. 


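Example A.10 We want to determine the straight line through the points P = (1, 2, 0) and Q = (3, 1, 2). A direction vector a of this line is given by

$$\mathbf{a} = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 2 \end{bmatrix}.$$

Thus a parametric representation of the straight line is

$$g: \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + \lambda \begin{bmatrix} 2 \\ -1 \\ 2 \end{bmatrix}, \quad \lambda \in \mathbb{R}.$$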

The conversion from parametric to nonparametric form and vice versa is achieved by elimination or introduction of a parameter $\lambda$. In the example above one computes $z = 2\lambda$ from the last equation and inserts it into the first two equations. This yields the nonparametric form

$$x - z = 1,$$
$$2y + z = 4.$$
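As a plausibility check of Example A.10, one can generate points on the line from the parametric form and confirm that they satisfy both plane equations. This is a sketch; the helper `line_point` is introduced here for illustration.

```python
def line_point(p, direction, lam):
    """Point p + lam * direction on a straight line in R^3."""
    return [pi + lam * di for pi, di in zip(p, direction)]

P = [1, 2, 0]
Q = [3, 1, 2]
a = [qi - pi for pi, qi in zip(P, Q)]   # direction vector Q - P
print(a)                                # [2, -1, 2]

# Each point of the line fulfils x - z = 1 and 2y + z = 4.
for lam in (-1, 0, 0.5, 1, 3):
    x, y, z = line_point(P, a, lam)
    assert x - z == 1 and 2 * y + z == 4
```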


Matrices 


In  this  book  matrix  algebra  is  required  in  multi-dimensional  calculus,  for  systems  of 
differential  equations  and  for  linear  regression.  This  appendix  serves  to  outline  the 
basic  notions.  A  more  detailed  presentation  can  be  found  in  [2]. 


B.1  Matrix  Algebra 

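An (m × n)-matrix A is a rectangular scheme of the form

$$A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}.$$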


The entries (coefficients, elements) $a_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, n$ of the matrix A are real or complex numbers. In this section we restrict ourselves to real numbers. An (m × n)-matrix has m rows and n columns; if m = n, the matrix is called square. Vectors of length m can be understood as matrices with one column, i.e. as (m × 1)-matrices. In particular, one refers to the columns

$$\mathbf{a}_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}, \quad j = 1, \dots, n,$$

of a matrix A as column vectors and accordingly also writes

$$A = [\mathbf{a}_1 : \mathbf{a}_2 : \dots : \mathbf{a}_n]$$

for the matrix. The rows of the matrix are sometimes called row vectors.

The product of an (m × n)-matrix A with a vector x of length n is defined as

$$\mathbf{y} = A\mathbf{x}, \qquad \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n \end{bmatrix}$$

and results in a vector y of length m. The kth entry of y is obtained by the inner product of the kth row vector of the matrix A (written as a column) with the vector x.


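Example B.1 For instance, the product of a (2 × 3)-matrix with a vector of length 3 is computed as follows:

$$A = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix}, \quad A\mathbf{x} = \begin{bmatrix} 3a - b + 2c \\ 3d - e + 2f \end{bmatrix}.$$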

The assignment $\mathbf{x} \mapsto \mathbf{y} = A\mathbf{x}$ defines a linear mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$. The linearity is characterised by the validity of the relations

$$A(\mathbf{u} + \mathbf{v}) = A\mathbf{u} + A\mathbf{v}, \quad A(\lambda\mathbf{u}) = \lambda A\mathbf{u}$$

for all $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$, which follow immediately from the definition of matrix multiplication. If $\mathbf{e}_j$ is the jth standard basis vector of $\mathbb{R}^n$, then obviously

$$\mathbf{a}_j = A\mathbf{e}_j.$$

This means that the columns of the matrix A are just the images of the standard basis vectors under the linear mapping defined by A.

Matrix arithmetic. Matrices of the same format can be added and subtracted by adding or subtracting their components. Multiplication with a number $\lambda \in \mathbb{R}$ is also defined componentwise. The transpose $A^T$ of a matrix A is obtained by swapping rows and columns; i.e. the ith row of the matrix $A^T$ consists of the elements of the ith column of A:

$$A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}, \quad A^T = \begin{bmatrix} a_{11} & a_{21} & \dots & a_{m1} \\ a_{12} & a_{22} & \dots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \dots & a_{mn} \end{bmatrix}.$$

By transposition an (m × n)-matrix becomes an (n × m)-matrix. In particular, transposition changes a column vector into a row vector and vice versa.
Example B.2 For the matrix A and the vector x from Example B.1 we have:

$$A^T = \begin{bmatrix} a & d \\ b & e \\ c & f \end{bmatrix}, \quad \mathbf{x}^T = [3 \ \ {-1} \ \ 2], \quad \mathbf{x} = [3 \ \ {-1} \ \ 2]^T.$$


If a, b are vectors of length n, then one can regard $\mathbf{a}^T$ as a (1 × n)-matrix. Its product with the vector b is defined as above and coincides with the inner product:

$$\mathbf{a}^T \mathbf{b} = \sum_{i=1}^{n} a_i b_i = \langle \mathbf{a}, \mathbf{b} \rangle.$$

More generally, the product of an (m × n)-matrix A with an (n × l)-matrix B can be defined by forming the inner products of the row vectors of A with the column vectors of B. This means that the element $c_{ij}$ in the ith row and jth column of C = AB is obtained by inner multiplication of the ith row of A with the jth column of B:

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}.$$

The result is an (m × l)-matrix. The product is only defined if the dimensions match, i.e. if the number of columns n of A is equal to the number of rows of B. The matrix product corresponds to the composition of linear mappings. If B is the matrix of a linear mapping $\mathbb{R}^l \to \mathbb{R}^n$ and A the matrix of a linear mapping $\mathbb{R}^n \to \mathbb{R}^m$, then AB is just the matrix of the composition of the two mappings $\mathbb{R}^l \to \mathbb{R}^n \to \mathbb{R}^m$. The transposition of the product is given by the formula

$$(AB)^T = B^T A^T,$$

which can easily be deduced from the definitions.
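The definition of the matrix product and the transposition rule can be tried out with nested lists in plain Python. This is a minimal sketch; the functions `matmul` and `transpose` and the sample matrices are illustrative and not part of the text.

```python
def transpose(A):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Product of an (m x n)-matrix A with an (n x l)-matrix B:
    c_ij is the inner product of row i of A with column j of B."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 0],
     [0, 1, -1]]      # (2 x 3), sample values
B = [[1, 0],
     [2, 1],
     [1, 3]]          # (3 x 2), sample values

C = matmul(A, B)
print(C)                                       # [[5, 2], [1, -2]]
print(matmul(transpose(B), transpose(A)))      # [[5, 1], [2, -2]], equals the transpose of C
```

A matrix-vector product such as the one in Example B.1 is obtained as the special case in which B has a single column.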


Square matrices. The entries $a_{11}, a_{22}, \dots, a_{nn}$ of an (n × n)-matrix A are called the diagonal elements. A square matrix D is called diagonal matrix, if its entries are all zero with the possible exception of the diagonal elements. Special cases are the zero matrix and the unit matrix of dimension n × n:

$$O = \begin{bmatrix} 0 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 0 \end{bmatrix}, \quad I = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix}.$$

The unit matrix is the identity with respect to matrix multiplication. For all (n × n)-matrices A it holds that IA = AI = A. If for a given matrix A there exists a matrix B with the property

$$BA = AB = I,$$
then one calls A invertible or regular and B the inverse of A, denoted by

$$B = A^{-1}.$$

Let $\mathbf{x} \in \mathbb{R}^n$, A an invertible (n × n)-matrix and $\mathbf{y} = A\mathbf{x}$. Then x can be computed as $\mathbf{x} = A^{-1}\mathbf{y}$; in particular, $A^{-1}A\mathbf{x} = \mathbf{x}$ and $AA^{-1}\mathbf{y} = \mathbf{y}$. This shows that the linear mapping $\mathbb{R}^n \to \mathbb{R}^n$ induced by the matrix A is bijective and $A^{-1}$ represents the inverse mapping. The bijectivity of A can be expressed in yet another way. It means that for every $\mathbf{y} \in \mathbb{R}^n$ there is one and only one $\mathbf{x} \in \mathbb{R}^n$ such that

$$A\mathbf{x} = \mathbf{y}, \quad\text{or}\quad \begin{aligned} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= y_1, \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= y_2, \\ &\ \,\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n &= y_n. \end{aligned}$$

The latter can be considered as a linear system of equations with right-hand side y and solution $\mathbf{x} = [x_1 \ x_2 \ \dots \ x_n]^T$. In other words, the invertibility of a matrix A is equivalent to the bijectivity of the corresponding linear mapping and equivalent to the unique solvability of the corresponding linear system of equations (for arbitrary right-hand sides).

For the remainder of this appendix we restrict our attention to (2 × 2)-matrices. Let A be a (2 × 2)-matrix with the corresponding system of equations:

$$A = [\mathbf{a}_1 : \mathbf{a}_2] = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad \begin{aligned} a_{11}x_1 + a_{12}x_2 &= y_1, \\ a_{21}x_1 + a_{22}x_2 &= y_2. \end{aligned}$$

An important role is played by the determinant of the matrix A. In the (2 × 2)-case it is defined as the cross product of the column vectors:

$$\det A = \mathbf{a}_1 \times \mathbf{a}_2 = a_{11}a_{22} - a_{21}a_{12}.$$

Since $\mathbf{a}_1 \times \mathbf{a}_2 = \|\mathbf{a}_1\| \, \|\mathbf{a}_2\| \sin\angle(\mathbf{a}_1, \mathbf{a}_2)$, the column vectors $\mathbf{a}_1, \mathbf{a}_2$ are linearly dependent (so, in $\mathbb{R}^2$, multiples of each other), if and only if det A = 0. The following proposition characterises invertibility in the (2 × 2)-case completely.

Proposition B.3 For (2 × 2)-matrices A the following statements are equivalent:

(a) A is invertible.

(b) The linear mapping $\mathbb{R}^2 \to \mathbb{R}^2$ defined by A is bijective.

(c) The linear system of equations $A\mathbf{x} = \mathbf{y}$ has a unique solution $\mathbf{x} \in \mathbb{R}^2$ for arbitrary right-hand sides $\mathbf{y} \in \mathbb{R}^2$.

(d) The column vectors of A are linearly independent.

(e) The linear mapping $\mathbb{R}^2 \to \mathbb{R}^2$ defined by A is injective.

(f) The only solution of the linear system of equations $A\mathbf{x} = \mathbf{0}$ is the zero solution $\mathbf{x} = \mathbf{0}$.

(g) $\det A \ne 0$.

Proof The equivalence of the statements (a), (b) and (c) was already observed above. The equivalence of (d), (e) and (f) can easily be seen by negation. Indeed, if the column vectors are linearly dependent, then there exists $\mathbf{x} = [x_1 \ x_2]^T \ne \mathbf{0}$ with $x_1\mathbf{a}_1 + x_2\mathbf{a}_2 = \mathbf{0}$. On the one hand, this means that the vector x is mapped to 0 by A; thus this mapping is not injective. On the other hand, x is a nontrivial solution of the linear system of equations $A\mathbf{x} = \mathbf{0}$. The converse implications are shown in the same way. Thus (d), (e) and (f) are equivalent. The equivalence of (g) and (d) is obvious from the geometric meaning of the determinant. If the determinant does not vanish then

$$A^{-1} = \frac{1}{a_{11}a_{22} - a_{21}a_{12}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$$

is an inverse to A, as can be verified at once. Thus (g) implies (a). Finally, (e) obviously follows from (b). Hence all statements (a)-(g) are equivalent. □

Proposition B.3 holds for matrices of arbitrary dimension n × n. For n = 3 one can still use geometrical arguments. The cross product, however, has to be replaced by the triple product $\langle \mathbf{a}_1 \times \mathbf{a}_2, \mathbf{a}_3 \rangle$ of the three column vectors, which then also defines the determinant of the (3 × 3)-matrix A. In higher dimensions the proof requires tools from combinatorics, for which we refer to the literature.
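For (2 × 2)-matrices, the determinant criterion and the explicit inverse from the proof of Proposition B.3 can be sketched in Python. The names `det2` and `inverse2` are illustrative; exact rational arithmetic via `fractions.Fraction` is a choice made here to avoid rounding effects.

```python
from fractions import Fraction

def det2(A):
    """Determinant a11*a22 - a21*a12 of a (2 x 2)-matrix."""
    (a11, a12), (a21, a22) = A
    return a11 * a22 - a21 * a12

def inverse2(A):
    """Inverse of a (2 x 2)-matrix via the explicit formula."""
    d = det2(A)
    if d == 0:
        raise ValueError("matrix is not invertible (det A = 0)")
    (a11, a12), (a21, a22) = A
    f = Fraction(1) / d
    return [[f * a22, -f * a12],
            [-f * a21, f * a11]]

A = [[2, 1],
     [5, 3]]              # sample matrix with det A = 1
Ainv = inverse2(A)

# Solve A x = y for the sample right-hand side y = [1, 2]:
y = [1, 2]
x = [Ainv[0][0] * y[0] + Ainv[0][1] * y[1],
     Ainv[1][0] * y[0] + Ainv[1][1] * y[1]]
print([int(v) for v in x])   # [1, -1]
```

For a matrix with linearly dependent columns, such as [[1, 2], [2, 4]], `det2` returns 0 and `inverse2` raises an error, in line with statements (d) and (g).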


B.2  Canonical  Form  of  Matrices 

In this subsection we will show that every (2 × 2)-matrix A is similar to a matrix of standard type, which means that it can be put into standard form by a basis transformation. We need this fact in Sect. 20.1 for the classification and solution of systems of differential equations. The transformation explained below is a special case of the Jordan canonical form¹ for (n × n)-matrices.

If T is an invertible (2 × 2)-matrix, then the columns $\mathbf{t}_1, \mathbf{t}_2$ form a basis of $\mathbb{R}^2$. This means that every element $\mathbf{x} \in \mathbb{R}^2$ can be written in a unique way as a linear combination $c_1\mathbf{t}_1 + c_2\mathbf{t}_2$; the coefficients $c_1, c_2 \in \mathbb{R}$ are the coordinates of x with respect to $\mathbf{t}_1$ and $\mathbf{t}_2$. One can regard T as a linear transformation of $\mathbb{R}^2$ which maps the standard basis $\{[1 \ 0]^T, [0 \ 1]^T\}$ to the basis $\{\mathbf{t}_1, \mathbf{t}_2\}$.

Definition B.4 Two matrices A, B are called similar, if there exists an invertible matrix T such that $T^{-1}AT = B$.

¹ C. Jordan, 1838-1922.
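The three standard types which will define the similarity classes of (2 × 2)-matrices are of the following form:

$$\text{type I: } \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}, \qquad \text{type II: } \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}, \qquad \text{type III: } \begin{bmatrix} \mu & -\nu \\ \nu & \mu \end{bmatrix}.$$

Here the coefficients $\lambda_1, \lambda_2, \lambda, \mu, \nu$ are real numbers.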

In what follows, we need the notion of eigenvalues and eigenvectors. If the equation

$$A\mathbf{v} = \lambda\mathbf{v}$$

has a solution $\mathbf{v} \ne \mathbf{0}$, $\mathbf{v} \in \mathbb{R}^2$, for some $\lambda \in \mathbb{R}$, then $\lambda$ is called eigenvalue and v eigenvector of A. In other words, v is a solution of the equation

$$(A - \lambda I)\mathbf{v} = \mathbf{0},$$

where I denotes again the unit matrix. For the existence of a nonzero solution v it is necessary and sufficient that the matrix $A - \lambda I$ is not invertible, i.e.

$$\det(A - \lambda I) = 0.$$


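By writing

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

we see that $\lambda$ has to be a solution of the characteristic equation

$$\det \begin{bmatrix} a - \lambda & b \\ c & d - \lambda \end{bmatrix} = \lambda^2 - (a + d)\lambda + ad - bc = 0.$$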


If this equation has a real solution $\lambda$, then the system of equations $(A - \lambda I)\mathbf{v} = \mathbf{0}$ is underdetermined and thus has a nonzero solution $\mathbf{v} = [v_1 \ v_2]^T$. Hence one obtains the eigenvectors to the eigenvalue $\lambda$ by solving the linear system

$$(a - \lambda)\,v_1 + b\,v_2 = 0,$$
$$c\,v_1 + (d - \lambda)\,v_2 = 0.$$

Depending on whether the characteristic equation has two real, a double real or two complex conjugate solutions, we obtain one of the three similarity classes of A.
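The computation of real eigenvalues and eigenvectors described above can be sketched in Python. The function name `real_eigenpairs` and the particular choice of eigenvector (one nonzero solution of the underdetermined system, picked by inspecting b and c) are assumptions of this illustration.

```python
import math

def real_eigenpairs(a, b, c, d):
    """Real eigenvalues and one eigenvector each for A = [[a, b], [c, d]].
    Returns an empty list if the characteristic equation has no real solution."""
    disc = (a - d) ** 2 + 4 * b * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    lams = [(a + d + root) / 2]
    if disc > 0:
        lams.append((a + d - root) / 2)
    pairs = []
    for lam in lams:
        if b != 0:
            v = [b, lam - a]          # solves (a - lam) v1 + b v2 = 0
        elif c != 0:
            v = [lam - d, c]          # solves c v1 + (d - lam) v2 = 0
        else:
            v = [1, 0] if lam == a else [0, 1]   # A is already diagonal
        pairs.append((lam, v))
    return pairs

for lam, v in real_eigenpairs(2, 1, 1, 2):
    print(lam, v)
# 3.0 [1, 1.0]
# 1.0 [1, -1.0]
```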


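Proposition B.5 Every (2 × 2)-matrix A is similar to a matrix of type I, II or III.

Proof (1) The case of two distinct real eigenvalues $\lambda_1 \ne \lambda_2$. With

$$\mathbf{v}_1 = \begin{bmatrix} v_{11} \\ v_{21} \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} v_{12} \\ v_{22} \end{bmatrix}$$

we denote the corresponding eigenvectors. They are linearly independent and thus form a basis of $\mathbb{R}^2$. Otherwise they would be multiples of each other and so $c\mathbf{v}_1 = \mathbf{v}_2$ for some nonzero $c \in \mathbb{R}$. Applying A would result in $c\lambda_1\mathbf{v}_1 = \lambda_2\mathbf{v}_2 = \lambda_2 c\mathbf{v}_1$ and thus $\lambda_1 = \lambda_2$ in contradiction to the hypothesis. According to Proposition B.3 the matrix

$$T = [\mathbf{v}_1 : \mathbf{v}_2] = \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix}$$

is invertible. Using

$$A\mathbf{v}_1 = \lambda_1\mathbf{v}_1, \quad A\mathbf{v}_2 = \lambda_2\mathbf{v}_2,$$

we obtain the identities

$$T^{-1}AT = T^{-1}A[\mathbf{v}_1 : \mathbf{v}_2] = T^{-1}[\lambda_1\mathbf{v}_1 : \lambda_2\mathbf{v}_2]$$

$$= \frac{1}{v_{11}v_{22} - v_{21}v_{12}} \begin{bmatrix} v_{22} & -v_{12} \\ -v_{21} & v_{11} \end{bmatrix} \begin{bmatrix} \lambda_1 v_{11} & \lambda_2 v_{12} \\ \lambda_1 v_{21} & \lambda_2 v_{22} \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}.$$

The matrix A is similar to a diagonal matrix and thus of type I.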
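(2) The case of a double real eigenvalue $\lambda = \lambda_1 = \lambda_2$. Since

$$\lambda = \frac{1}{2}\left(a + d \pm \sqrt{(a - d)^2 + 4bc}\right)$$

is the solution of the characteristic equation, this case occurs if

$$(a - d)^2 = -4bc, \quad \lambda = \frac{1}{2}(a + d).$$

If b = 0 and c = 0, then a = d and A is already a diagonal matrix of the form

$$A = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix},$$

thus of type I. If $b \ne 0$, we compute c from $(a - d)^2 = -4bc$ and find

$$A - \lambda I = \begin{bmatrix} a - \lambda & b \\ c & d - \lambda \end{bmatrix} = \begin{bmatrix} \frac{1}{2}(a - d) & b \\ -\frac{1}{4b}(a - d)^2 & -\frac{1}{2}(a - d) \end{bmatrix}.$$

Note that

$$\begin{bmatrix} \frac{1}{2}(a - d) & b \\ -\frac{1}{4b}(a - d)^2 & -\frac{1}{2}(a - d) \end{bmatrix} \begin{bmatrix} \frac{1}{2}(a - d) & b \\ -\frac{1}{4b}(a - d)^2 & -\frac{1}{2}(a - d) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},$$

or $(A - \lambda I)^2 = O$. In this case, $A - \lambda I$ is called a nilpotent matrix. A similar calculation shows that $(A - \lambda I)^2 = O$ if $c \ne 0$. We now choose a vector $\mathbf{v}_2 \in \mathbb{R}^2$ for which $(A - \lambda I)\mathbf{v}_2 \ne \mathbf{0}$. Due to the above consideration this vector satisfies

$$(A - \lambda I)^2\mathbf{v}_2 = \mathbf{0}.$$

If we set

$$\mathbf{v}_1 = (A - \lambda I)\mathbf{v}_2,$$

then obviously

$$A\mathbf{v}_1 = \lambda\mathbf{v}_1, \quad A\mathbf{v}_2 = \mathbf{v}_1 + \lambda\mathbf{v}_2.$$

Further $\mathbf{v}_1$ and $\mathbf{v}_2$ are linearly independent (because if $\mathbf{v}_1$ were a multiple of $\mathbf{v}_2$, then $A\mathbf{v}_2 = \lambda\mathbf{v}_2$ in contradiction to the construction of $\mathbf{v}_2$). We set

$$T = [\mathbf{v}_1 : \mathbf{v}_2].$$

The computation

$$T^{-1}AT = T^{-1}[\lambda\mathbf{v}_1 : \mathbf{v}_1 + \lambda\mathbf{v}_2] = \frac{1}{v_{11}v_{22} - v_{21}v_{12}} \begin{bmatrix} v_{22} & -v_{12} \\ -v_{21} & v_{11} \end{bmatrix} \begin{bmatrix} \lambda v_{11} & v_{11} + \lambda v_{12} \\ \lambda v_{21} & v_{21} + \lambda v_{22} \end{bmatrix} = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}$$

shows that A is similar to a matrix of type II.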

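(3) The case of complex conjugate solutions $\lambda_1 = \mu + i\nu$, $\lambda_2 = \mu - i\nu$. This case arises if the discriminant $(a - d)^2 + 4bc$ is negative. The most elegant way to deal with this case is to switch to complex variables and to perform the computations in the complex vector space $\mathbb{C}^2$. We first determine complex vectors $\mathbf{v}_1, \mathbf{v}_2 \in \mathbb{C}^2$ such that

$$A\mathbf{v}_1 = \lambda_1\mathbf{v}_1, \quad A\mathbf{v}_2 = \lambda_2\mathbf{v}_2$$

and then decompose $\mathbf{v}_1 = \mathbf{f} + i\mathbf{g}$ into real and imaginary parts with vectors f, g in $\mathbb{R}^2$. Since $\lambda_1 = \mu + i\nu$, $\lambda_2 = \mu - i\nu$, it follows that

$$\mathbf{v}_2 = \mathbf{f} - i\mathbf{g}.$$

Note that $\{\mathbf{v}_1, \mathbf{v}_2\}$ forms a basis of $\mathbb{C}^2$. Thus $\{\mathbf{g}, \mathbf{f}\}$ is a basis of $\mathbb{R}^2$ and

$$A(\mathbf{f} + i\mathbf{g}) = (\mu + i\nu)(\mathbf{f} + i\mathbf{g}) = \mu\mathbf{f} - \nu\mathbf{g} + i(\nu\mathbf{f} + \mu\mathbf{g}),$$

consequently

$$A\mathbf{g} = \nu\mathbf{f} + \mu\mathbf{g}, \quad A\mathbf{f} = \mu\mathbf{f} - \nu\mathbf{g}.$$

Again we set

$$T = [\mathbf{g} : \mathbf{f}] = \begin{bmatrix} g_1 & f_1 \\ g_2 & f_2 \end{bmatrix},$$

from which we deduce

$$T^{-1}AT = T^{-1}[\nu\mathbf{f} + \mu\mathbf{g} : \mu\mathbf{f} - \nu\mathbf{g}] = \frac{1}{g_1 f_2 - g_2 f_1} \begin{bmatrix} f_2 & -f_1 \\ -g_2 & g_1 \end{bmatrix} \begin{bmatrix} \nu f_1 + \mu g_1 & \mu f_1 - \nu g_1 \\ \nu f_2 + \mu g_2 & \mu f_2 - \nu g_2 \end{bmatrix} = \begin{bmatrix} \mu & -\nu \\ \nu & \mu \end{bmatrix}.$$

Thus A is similar to a matrix of type III. □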
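The case distinction of the proof can be summarised in a short Python sketch that returns the type and a standard form for a given matrix [[a, b], [c, d]]. The function name `canonical_form` is illustrative. The ordering of the two eigenvalues in type I and the sign convention ν > 0 in type III are choices made here, not prescriptions of the text.

```python
import math

def canonical_form(a, b, c, d):
    """Type (I, II or III) and a standard form of A = [[a, b], [c, d]],
    decided by the discriminant (a - d)^2 + 4bc of the characteristic equation."""
    disc = (a - d) ** 2 + 4 * b * c
    if disc > 0:                       # two distinct real eigenvalues
        r = math.sqrt(disc)
        l1, l2 = (a + d + r) / 2, (a + d - r) / 2
        return "I", [[l1, 0], [0, l2]]
    if disc < 0:                       # complex conjugate eigenvalues mu +- i*nu
        mu, nu = (a + d) / 2, math.sqrt(-disc) / 2
        return "III", [[mu, -nu], [nu, mu]]
    lam = (a + d) / 2                  # double real eigenvalue
    if b == 0 and c == 0:
        return "I", [[lam, 0], [0, lam]]
    return "II", [[lam, 1], [0, lam]]

print(canonical_form(2, 1, 1, 2))    # ('I', [[3.0, 0], [0, 1.0]])
print(canonical_form(1, 1, 0, 1))    # ('II', [[1.0, 1], [0, 1.0]])
print(canonical_form(0, -1, 1, 0))   # ('III', [[0.0, -1.0], [1.0, 0.0]])
```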


Further  Results  on  Continuity 


This  appendix  covers  further  material  on  continuity  which  is  not  central  for  this  book 
but  on  the  other  hand  is  required  in  various  proofs  (like  in  the  chapters  on  curves 
and  differential  equations).  It  includes  assertions  about  the  continuity  of  the  inverse 
function,  the  concept  of  uniform  convergence  of  sequences  of  functions,  the  power 
series  expansion  of  the  exponential  function  and  the  notions  of  uniform  and  Lipschitz 
continuity. 


C.1  Continuity  of  the  Inverse  Function 

We consider a real-valued function f defined on an interval $I \subset \mathbb{R}$. The interval I can be open, half-open or closed. By $J = f(I)$ we denote the image of f. First, we show that a continuous function $f : I \to J$ is bijective, if and only if it is strictly monotonically increasing or decreasing. Monotonicity was introduced in Definition 8.5. Subsequently, we show that the inverse function is continuous if f is continuous, and we describe the respective ranges.

Proposition C.1 A real-valued, continuous function $f : I \to J = f(I)$ is bijective if and only if it is strictly monotonically increasing or decreasing.

Proof We already know that the function $f : I \to f(I)$ is surjective. It is injective if and only if

$$x_1 \ne x_2 \ \Rightarrow\ f(x_1) \ne f(x_2).$$

Strict monotonicity thus implies injectivity. To prove the converse implication we start by choosing two points $x_1 < x_2 \in I$. Let $f(x_1) < f(x_2)$, for example. We will show that f is strictly monotonically increasing on the entire interval I. First we observe that for every $x_3 \in (x_1, x_2)$ we must have $f(x_1) < f(x_3) < f(x_2)$. This is shown by contradiction. Assuming $f(x_3) > f(x_2)$, Proposition 6.14 implies that every intermediate point $f(x_2) < \eta < f(x_3)$ would be the image of a point $\xi_1 \in (x_1, x_3)$ and also the image of a point $\xi_2 \in (x_3, x_2)$, contradicting injectivity.

If we now choose $x_4 \in I$ such that $x_2 < x_4$, then once again $f(x_2) < f(x_4)$. Otherwise we would have $x_1 < x_2 < x_4$ with $f(x_2) > f(x_4)$; this possibility is excluded as in the previous case. Finally, the points to the left of $x_1$ are inspected in a similar way. It follows that f is strictly monotonically increasing on the entire interval I. In the case $f(x_1) > f(x_2)$, one can deduce similarly that f is strictly monotonically decreasing. □

The function $y = x \cdot \mathbb{1}_{(-1,0]}(x) + (1 - x) \cdot \mathbb{1}_{(0,1)}(x)$, where $\mathbb{1}_I$ denotes the indicator function of the interval I (see Sect. 2.2), shows that a discontinuous function can be bijective on an interval without being strictly monotonically increasing or decreasing.

Remark C.2 If I is an open interval and $f : I \to J$ a continuous and bijective function, then J is an open interval as well. Indeed, if J were of the form $[a, b)$, then a would arise as function value of a point $x_1 \in I$, i.e. $a = f(x_1)$. However, since I is open, there are points $x_2 \in I$, $x_2 < x_1$ and $x_3 \in I$ with $x_3 > x_1$. If f is strictly monotonically increasing then we would have $f(x_2) < f(x_1) = a$. If f is strictly monotonically decreasing then $f(x_3) < f(x_1) = a$. Both cases contradict the fact that a was assumed to be the lower boundary of the image $J = f(I)$. In the same way, one excludes the possibilities that $J = (a, b]$ or $J = [a, b]$.

Proposition C.3 Let $I \subset \mathbb{R}$ be an open interval and $f : I \to J$ continuous and bijective. Then the inverse function $f^{-1} : J \to I$ is continuous as well.

Proof We take $x \in I$, $y \in J$ with $y = f(x)$, $x = f^{-1}(y)$. For small $\varepsilon > 0$ the $\varepsilon$-neighbourhood $U_\varepsilon(x)$ of x is contained in I. According to Remark C.2, $f(U_\varepsilon(x))$ is an open interval and therefore contains a $\delta$-neighbourhood $U_\delta(y)$ of y for a certain $\delta > 0$. Consider a sequence of values $y_n \in J$ which converges to y as $n \to \infty$. Then there is an index $n(\delta) \in \mathbb{N}$ such that all elements of the sequence $y_n$ with $n \ge n(\delta)$ lie in the $\delta$-neighbourhood $U_\delta(y)$. That, however, means that the values of the function $f^{-1}(y_n)$ from $n(\delta)$ onwards lie in the $\varepsilon$-neighbourhood $U_\varepsilon(x)$ of $x = f^{-1}(y)$. Thus $\lim_{n\to\infty} f^{-1}(y_n) = f^{-1}(y)$, which is the continuity of $f^{-1}$ at y. □


C.2  Limits  of  Sequences  of  Functions 

We consider a sequence of functions $f_n : I \to \mathbb{R}$, defined on an interval $I \subset \mathbb{R}$. If the function values $f_n(x)$ converge for every fixed $x \in I$, then the sequence $(f_n)_{n \ge 1}$ is called pointwise convergent. The pointwise limits define a function $f : I \to \mathbb{R}$ by $f(x) = \lim_{n\to\infty} f_n(x)$, the so-called limit function.

Example C.4 Let $I = [0, 1]$ and $f_n(x) = x^n$. Then $\lim_{n\to\infty} f_n(x) = 0$ if $0 \le x < 1$, and $\lim_{n\to\infty} f_n(1) = 1$. The limit function is thus the function

$$f(x) = \begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1. \end{cases}$$

This example shows that the limit function of a pointwise convergent sequence of continuous functions is not necessarily continuous.


Definition C.5 (Uniform convergence of sequences of functions) A sequence of functions $(f_n)_{n \ge 1}$ defined on an interval I is called uniformly convergent with limit function f, if

$$\forall \varepsilon > 0 \ \ \exists n(\varepsilon) \in \mathbb{N} \ \ \forall n \ge n(\varepsilon) \ \ \forall x \in I : \ |f(x) - f_n(x)| < \varepsilon.$$

Uniform convergence means that the index $n(\varepsilon)$ after which the sequence of function values $(f_n(x))_{n \ge 1}$ settles in the $\varepsilon$-neighbourhood $U_\varepsilon(f(x))$ can be chosen independently of $x \in I$.
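The sequence $f_n(x) = x^n$ on [0, 1] from Example C.4 converges pointwise but not uniformly: for every n there is a point x < 1 at which the distance to the limit function is still about 1/2. The following Python sketch makes this visible; the witness point $x = 0.5^{1/n}$ is a choice made for this illustration.

```python
def f_n(x, n):
    return x ** n

def f(x):
    """Pointwise limit function on [0, 1]."""
    return 1.0 if x == 1 else 0.0

def gap_at_witness(n):
    """Return a point x in [0, 1) and the distance |f(x) - f_n(x)| at it."""
    x = 0.5 ** (1.0 / n)       # chosen such that x**n is (up to rounding) 0.5
    return x, abs(f(x) - f_n(x, n))

for n in (1, 10, 100, 1000):
    x, gap = gap_at_witness(n)
    print(n, x, gap)           # the gap stays near 0.5 for every n

# At a fixed point, in contrast, the values do tend to the limit:
print(f_n(0.9, 1000))          # a tiny positive number
```

So for ε = 1/2, say, no index n(ε) works for all x in I simultaneously.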

Proposition C.6 The limit function f of a uniformly convergent sequence of continuous functions $(f_n)_{n \ge 1}$ is continuous.

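Proof We take $x \in I$ and a sequence of points $x_k$ converging to x as $k \to \infty$. We have to show that $f(x) = \lim_{k\to\infty} f(x_k)$. For this we write

$$f(x) - f(x_k) = \big(f(x) - f_n(x)\big) + \big(f_n(x) - f_n(x_k)\big) + \big(f_n(x_k) - f(x_k)\big)$$

and choose $\varepsilon > 0$. Due to the uniform convergence it is possible to find an index $n \in \mathbb{N}$ such that

$$|f(x) - f_n(x)| < \frac{\varepsilon}{3} \quad\text{and}\quad |f_n(x_k) - f(x_k)| < \frac{\varepsilon}{3}$$

for all $k \in \mathbb{N}$. Since $f_n$ is continuous, there is an index $k(\varepsilon) \in \mathbb{N}$ such that

$$|f_n(x) - f_n(x_k)| < \frac{\varepsilon}{3}$$

for all $k \ge k(\varepsilon)$. For such indices k we have

$$|f(x) - f(x_k)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$

Thus $f(x_k) \to f(x)$ as $k \to \infty$, which implies the continuity of f. □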
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Application  C.7  The  exponential  function  f(x)  =  ax  is  continuous  on  R.  In  Appli¬ 
cation  5. 14  it  was  shown  that  the  exponential  function  with  base  a  >  0  can  be  defined 
for  every  v  e  R  as  a  limit.  Let  rn  (x)  denote  the  decimal  representation  of  x,  truncated 
at  the  nth  decimal  place.  Then 

rn(x)  <  x  <  rn(x)  +  10"n. 

The  value  of  rn(x)  is  the  same  for  all  real  numbers  x,  which  coincide  up  to  the 
nth  decimal  place.  Thus  the  mapping  x  i->  rn(x)  is  a  step  function  with  jumps  at  a 
distance  of  10-77.  We  define  the  function  fn(x)  by  linear  interpolation  between  the 
points 

(rn(x),  ar"(x))  and  (rn(x)  +  I0~n ,  ar''(x)+10~"  ) , 

which  means 


fn(x)  =  ar"(x)  +  1  r"n(x)  (ar"(x)+10~"  -  ar"(x . 

The graph of the function $f_n(x)$ is a polygonal chain (with kinks at the distance of $10^{-n}$), and thus $f_n$ is continuous. We show that the sequence of functions $(f_n)_{n \ge 1}$ converges uniformly to $f$ on every interval $[-T, T]$, $0 < T \in \mathbb{Q}$. Since $x - r_n(x) < 10^{-n}$, it follows that

\[ |f(x) - f_n(x)| \le \bigl|a^x - a^{r_n(x)}\bigr| + \bigl|a^{r_n(x) + 10^{-n}} - a^{r_n(x)}\bigr|. \]

For $x \in [-T, T]$ we have

\[ a^x - a^{r_n(x)} = a^{r_n(x)} \bigl(a^{x - r_n(x)} - 1\bigr) \le a^T \bigl(a^{10^{-n}} - 1\bigr) \]

and likewise

\[ a^{r_n(x) + 10^{-n}} - a^{r_n(x)} \le a^T \bigl(a^{10^{-n}} - 1\bigr). \]

Consequently

\[ |f(x) - f_n(x)| \le 2 a^T \bigl(\sqrt[10^n]{a} - 1\bigr), \]

and the term on the right-hand side converges to zero independently of $x$, as was proven in Application 5.15.
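
The estimate above can be checked numerically. The following Python sketch is not one of the book's programs; the names `r_n`, `f_n`, `max_error` and `bound` are chosen here for illustration, and truncation is implemented with `math.floor` so that $r_n(x) \le x$ also holds for negative $x$ (an assumption about how truncation is meant). It compares the largest sampled error $|a^x - f_n(x)|$ on $[-T, T]$ with the bound $2a^T(a^{10^{-n}} - 1)$ for a base $a \ge 1$.

```python
import math

def r_n(x, n):
    """Round x down to n decimal places, so that r_n(x) <= x < r_n(x) + 10**-n."""
    h = 10.0 ** (-n)
    return math.floor(x / h) * h

def f_n(x, a, n):
    """Linear interpolation of t -> a**t between r_n(x) and r_n(x) + 10**-n."""
    h = 10.0 ** (-n)
    r = r_n(x, n)
    return a ** r + (x - r) / h * (a ** (r + h) - a ** r)

def max_error(a, T, n, samples=1000):
    """Largest sampled value of |a**x - f_n(x)| on [-T, T] (a grid estimate only)."""
    xs = [-T + 2 * T * i / (samples - 1) for i in range(samples)]
    return max(abs(a ** x - f_n(x, a, n)) for x in xs)

def bound(a, T, n):
    """Right-hand side 2 * a**T * (a**(10**-n) - 1) of the uniform estimate."""
    return 2 * a ** T * (a ** (10.0 ** (-n)) - 1)

a, T = 2.0, 3.0
for n in range(1, 5):
    print(n, max_error(a, T, n), bound(a, T, n))
```

The sampled error only estimates the supremum from below, so the output illustrates the estimate but does not prove it.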


The rules of calculation for real exponents can now also be derived by taking limits. Take, for example, $r, s \in \mathbb{R}$ with decimal approximations $(r_n)_{n \ge 1}$, $(s_n)_{n \ge 1}$. Then Proposition 5.7 and the continuity of the exponential function imply

\[ a^r a^s = \lim_{n \to \infty} \bigl(a^{r_n} a^{s_n}\bigr) = \lim_{n \to \infty} \bigl(a^{r_n + s_n}\bigr) = a^{r+s}. \]

With the help of Proposition C.3 the continuity of the logarithm follows as well.

C.3 The Exponential Series

The aim of this section is to derive the series representation of the exponential function

\[ e^x = \sum_{m=0}^{\infty} \frac{x^m}{m!} \]

by  using  exclusively  the  theory  of  convergent  series  without  resorting  to  differential 
calculus.  This  is  important  for  our  exposition  because  the  differentiability  of  the 
exponential  function  is  proven  with  the  help  of  the  series  representation  in  Sect.  7.2. 

As  a  tool  we  need  two  supplements  to  the  theory  of  series:  Absolute  convergence 
and  Cauchy’s  formula  for  the  product  of  two  series. 

Definition C.8 A series $\sum_{k=0}^{\infty} a_k$ is called absolutely convergent, if the series $\sum_{k=0}^{\infty} |a_k|$ of absolute values of its coefficients converges.

Proposition  C.9  Every  absolutely  convergent  series  is  convergent. 

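Proof We define the positive and the negative parts of the coefficient $a_k$ by

\[ a_k^+ = \begin{cases} a_k, & a_k \ge 0, \\ 0, & a_k < 0, \end{cases} \qquad a_k^- = \begin{cases} 0, & a_k \ge 0, \\ |a_k|, & a_k < 0. \end{cases} \]

Obviously, we have $0 \le a_k^+ \le |a_k|$ and $0 \le a_k^- \le |a_k|$. Thus the two series $\sum_{k=0}^{\infty} a_k^+$ and $\sum_{k=0}^{\infty} a_k^-$ converge due to the comparison criterion (Proposition 5.21) and the limit

\[ \lim_{n \to \infty} \sum_{k=0}^{n} a_k = \lim_{n \to \infty} \sum_{k=0}^{n} a_k^+ - \lim_{n \to \infty} \sum_{k=0}^{n} a_k^- \]

exists. Consequently, the series $\sum_{k=0}^{\infty} a_k$ converges. □

We consider two absolutely convergent series $\sum_{i=0}^{\infty} a_i$ and $\sum_{j=0}^{\infty} b_j$ and ask how their product can be computed. Term-by-term multiplication of the $n$th partial sums suggests considering the following scheme:

\[
\begin{array}{lllll}
\mathbf{a_0 b_0} & \mathbf{a_0 b_1} & \ldots & \mathbf{a_0 b_{n-1}} & \mathbf{a_0 b_n} \\
\mathbf{a_1 b_0} & \mathbf{a_1 b_1} & \ldots & \mathbf{a_1 b_{n-1}} & a_1 b_n \\
\vdots & \vdots & & \vdots & \vdots \\
\mathbf{a_{n-1} b_0} & \mathbf{a_{n-1} b_1} & \ldots & a_{n-1} b_{n-1} & a_{n-1} b_n \\
\mathbf{a_n b_0} & a_n b_1 & \ldots & a_n b_{n-1} & a_n b_n
\end{array}
\]
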
2 A.L. Cauchy, 1789-1857.

Adding all entries of the square scheme one obtains the product of the partial sums

\[ P_n = \sum_{i=0}^{n} a_i \sum_{j=0}^{n} b_j. \]

In contrast, adding only the upper triangle containing the bold entries (diagonal by diagonal), one obtains the so-called Cauchy product formula

\[ S_n = \sum_{m=0}^{n} \left( \sum_{k=0}^{m} a_k b_{m-k} \right). \]

We want to show that, for absolutely convergent series, the limits are equal:

\[ \lim_{n \to \infty} P_n = \lim_{n \to \infty} S_n. \]

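Proposition C.10 (Cauchy product) If the series $\sum_{i=0}^{\infty} a_i$ and $\sum_{j=0}^{\infty} b_j$ converge absolutely then

\[ \sum_{i=0}^{\infty} a_i \sum_{j=0}^{\infty} b_j = \sum_{m=0}^{\infty} \left( \sum_{k=0}^{m} a_k b_{m-k} \right). \]

The series defined by the Cauchy product formula also converges absolutely.
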


Proof We set

\[ c_m = \sum_{k=0}^{m} a_k b_{m-k} \]

and obtain that the partial sums

\[ T_n = \sum_{m=0}^{n} |c_m| \le \sum_{i=0}^{n} |a_i| \sum_{j=0}^{n} |b_j| \le \sum_{i=0}^{\infty} |a_i| \sum_{j=0}^{\infty} |b_j| \]

remain bounded. This follows from the facts that the triangle in the scheme above has fewer entries than the square and the original series converge absolutely. Obviously the sequence $T_n$ is also monotonically increasing; according to Proposition 5.10 it thus has a limit. This means that the series $\sum_{m=0}^{\infty} c_m$ converges absolutely, so the Cauchy product exists. It remains to be shown that it coincides with the product of the series. For the partial sums, we have


\[ |P_n - S_n| = \left| \sum_{i=0}^{n} a_i \sum_{j=0}^{n} b_j - \sum_{m=0}^{n} c_m \right| \le \left| \sum_{m=n+1}^{\infty} c_m \right|, \]

since the difference can obviously be approximated by the sum of the terms below the $n$th diagonal. The latter sum, however, is just the difference of the partial sum $S_n$ and the value of the series $\sum_{m=0}^{\infty} c_m$. It thus converges to zero and the desired assertion is proven. □
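
A small numerical experiment can illustrate Proposition C.10. The Python sketch below is an illustration only; the function name `cauchy_product` and the choice of two geometric series with ratios $x$ and $-x$ are assumptions made here, not taken from the book's programs. It computes the coefficients $c_m$, the square sum $P_n$ and the triangle sum $S_n$, and compares both with the product of the limits $\frac{1}{1-x} \cdot \frac{1}{1+x}$.

```python
def cauchy_product(a, b):
    """Coefficients c_m = sum_{k=0}^{m} a[k] * b[m-k] for all m covered by both lists."""
    size = min(len(a), len(b))
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(size)]

x = 0.5
n = 60
a = [x ** i for i in range(n + 1)]      # geometric series with limit 1 / (1 - x)
b = [(-x) ** j for j in range(n + 1)]   # geometric series with limit 1 / (1 + x)
c = cauchy_product(a, b)

P_n = sum(a) * sum(b)   # all entries of the square scheme
S_n = sum(c)            # upper triangle, summed diagonal by diagonal
limit = 1 / (1 - x) / (1 + x)
print(P_n, S_n, limit)
```

Both geometric series converge absolutely for $|x| < 1$, so the proposition applies; for larger $|x|$ the experiment would not be meaningful.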


Let

\[ E(x) = \sum_{m=0}^{\infty} \frac{x^m}{m!}, \qquad E_n(x) = \sum_{m=0}^{n} \frac{x^m}{m!}. \]

The convergence of the series for $x = 1$ was shown in Example 5.24 and for $x = 2$ in Exercise 14 of Chap. 5. The absolute convergence for arbitrary $x \in \mathbb{R}$ can either be shown analogously or by using the ratio test (Exercise 15 in Chap. 5). If $x$ varies in a bounded interval $I = [-R, R]$, then the sequence of the partial sums $E_n(x)$ converges uniformly to $E(x)$, due to the uniform estimate

\[ |E(x) - E_n(x)| = \left| \sum_{m=n+1}^{\infty} \frac{x^m}{m!} \right| \le \sum_{m=n+1}^{\infty} \frac{R^m}{m!} \to 0 \]

on the interval $[-R, R]$. Proposition C.6 implies that the function $x \mapsto E(x)$ is continuous.
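
The uniform estimate can be observed numerically. In the following Python sketch the names `E_n`, `sup_error` and `tail` are chosen for illustration, and the limit $E(x)$ is replaced by the partial sum with 60 terms, which is an assumption that is harmless for moderate $R$ because the remaining terms lie far below double precision. The sampled error on $[-R, R]$ is compared with the tail $\sum_{m>n} R^m/m!$.

```python
import math

def E_n(x, n):
    """Partial sum sum_{m=0}^{n} x**m / m!, accumulated term by term."""
    term, total = 1.0, 1.0
    for m in range(1, n + 1):
        term *= x / m
        total += term
    return total

def sup_error(R, n, samples=401, n_ref=60):
    """Largest sampled |E(x) - E_n(x)| on [-R, R], with E approximated by E_{n_ref}."""
    xs = [-R + 2 * R * i / (samples - 1) for i in range(samples)]
    return max(abs(E_n(x, n_ref) - E_n(x, n)) for x in xs)

def tail(R, n, n_ref=60):
    """Approximation of the tail sum_{m=n+1}^{infinity} R**m / m!."""
    return E_n(R, n_ref) - E_n(R, n)

R = 2.0
for n in (2, 5, 10):
    print(n, sup_error(R, n), tail(R, n))
print(E_n(1.0, 20), math.e)
```

The last line compares $E_{20}(1)$ with the library constant `math.e`, in line with the definition of $e$ as the value of the series at $x = 1$ (Example 5.24).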

For the derivation of the product formula $E(x)E(y) = E(x + y)$ we recall the binomial formula:

\[ (x + y)^m = \sum_{k=0}^{m} \binom{m}{k} x^k y^{m-k} \quad \text{with} \quad \binom{m}{k} = \frac{m!}{k!\,(m-k)!}, \]

valid for arbitrary $x, y \in \mathbb{R}$ and $m \in \mathbb{N}$, see, for instance, [17, Chap. XIII].


Proposition C.11 For arbitrary $x, y \in \mathbb{R}$ it holds that

\[ \sum_{i=0}^{\infty} \frac{x^i}{i!} \sum_{j=0}^{\infty} \frac{y^j}{j!} = \sum_{m=0}^{\infty} \frac{(x + y)^m}{m!}. \]


Proof Due to the absolute convergence of the above series, Proposition C.10 yields

\[ \sum_{i=0}^{\infty} \frac{x^i}{i!} \sum_{j=0}^{\infty} \frac{y^j}{j!} = \sum_{m=0}^{\infty} \sum_{k=0}^{m} \frac{x^k}{k!} \frac{y^{m-k}}{(m-k)!}. \]

An application of the binomial formula

\[ \sum_{k=0}^{m} \frac{x^k}{k!} \frac{y^{m-k}}{(m-k)!} = \frac{1}{m!} \sum_{k=0}^{m} \binom{m}{k} x^k y^{m-k} = \frac{1}{m!} (x + y)^m \]

shows the desired assertion. □

Proposition C.12 (Series representation of the exponential function) The exponential function possesses the series representation

\[ e^x = \sum_{m=0}^{\infty} \frac{x^m}{m!}, \]

valid for arbitrary $x \in \mathbb{R}$.


Proof By definition of the number $e$ (see Example 5.24) we obviously have

\[ e^0 = 1 = E(0), \qquad e^1 = e = E(1). \]

From Proposition C.11 we get in particular

\[ e^2 = e^{1+1} = e^1 e^1 = E(1)E(1) = E(1 + 1) = E(2) \]

and recursively

\[ e^m = E(m) \quad \text{for } m \in \mathbb{N}. \]

The relation $E(m)E(-m) = E(m - m) = E(0) = 1$ shows that

\[ e^{-m} = \frac{1}{e^m} = \frac{1}{E(m)} = E(-m). \]

Likewise, one concludes from $(E(1/n))^n = E(1)$ that

\[ e^{1/n} = \sqrt[n]{e} = \sqrt[n]{E(1)} = E(1/n). \]

So far this shows that $e^x = E(x)$ holds for all rational $x = m/n$. From Application C.7 we know that the exponential function $x \mapsto e^x$ is continuous. The continuity of the function $x \mapsto E(x)$ was shown above. But two continuous functions which coincide for all rational numbers are equal. More precisely, if $x \in \mathbb{R}$ and $x_j$ is the decimal expansion of $x$ truncated at the $j$th place, then

\[ e^x = \lim_{j \to \infty} e^{x_j} = \lim_{j \to \infty} E(x_j) = E(x), \]

which is the desired result. □


Remark C.13 The rigorous introduction of the exponential function is surprisingly involved and is handled differently by different authors. The total effort, however, is approximately the same in all approaches. We took the following route: Introduction of Euler's number $e$ as the value of a convergent series (Example 5.24); definition of the exponential function $x \mapsto e^x$ for $x \in \mathbb{R}$ by using the completeness of the real numbers (Application 5.14); continuity of the exponential function based on uniform convergence (Application C.7); series representation (Proposition C.12); differentiability and calculation of the derivative (Sect. 7.2). Finally, in the course of the computation of the derivative we also obtained the well-known formula $e = \lim_{n \to \infty} (1 + 1/n)^n$, which Euler himself used to define the number $e$.


C.4  Lipschitz  Continuity  and  Uniform  Continuity 

Some  results  on  curves  and  differential  equations  require  more  refined  continuity 
properties.  More  precisely,  methods  for  quantifying  how  the  function  values  change 
in  dependence  on  the  arguments  are  needed. 

Definition C.14 A function $f : D \subset \mathbb{R} \to \mathbb{R}$ is called Lipschitz continuous, if there exists a constant $L > 0$ such that the inequality

\[ |f(x_1) - f(x_2)| \le L\,|x_1 - x_2| \]

holds for all $x_1, x_2 \in D$. In this case $L$ is called a Lipschitz constant of the function $f$.

If $x \in D$ and $(x_n)_{n \ge 1}$ is a sequence of points in $D$ which converges to $x$, the inequality $|f(x) - f(x_n)| \le L\,|x - x_n|$ implies that $f(x_n) \to f(x)$ as $n \to \infty$. Every Lipschitz continuous function is thus continuous. For Lipschitz continuous functions one can quantify how much change in the $x$-values can be allowed to obtain a change in the function values of $\varepsilon > 0$ at the most:

\[ |x_1 - x_2| < \varepsilon / L \;\Rightarrow\; |f(x_1) - f(x_2)| < \varepsilon. \]

Occasionally  the  following  weaker  quantification  is  required. 

Definition C.15 A function $f : D \subset \mathbb{R} \to \mathbb{R}$ is called uniformly continuous, if there exists a mapping $\omega : (0, 1] \to (0, 1] : \varepsilon \mapsto \omega(\varepsilon)$ such that

\[ |x_1 - x_2| < \omega(\varepsilon) \;\Rightarrow\; |f(x_1) - f(x_2)| < \varepsilon \]

for all $x_1, x_2 \in D$. In this case the mapping $\omega$ is called a modulus of continuity of the function $f$.

Every Lipschitz continuous function is uniformly continuous (with $\omega(\varepsilon) = \varepsilon / L$), and every uniformly continuous function is continuous.

Example C.16 (a) The quadratic function $f(x) = x^2$ is Lipschitz continuous on every bounded interval $[a, b]$. For $x_1 \in [a, b]$ we have $|x_1| \le M = \max(|a|, |b|)$ and likewise for $x_2$. Thus

\[ |f(x_1) - f(x_2)| = |x_1^2 - x_2^2| = |x_1 + x_2|\,|x_1 - x_2| \le 2M\,|x_1 - x_2| \]

holds for all $x_1, x_2 \in [a, b]$.

(b) The absolute value function $f(x) = |x|$ is Lipschitz continuous on $D = \mathbb{R}$ (with Lipschitz constant $L = 1$). This follows from the inequality

\[ \bigl|\,|x_1| - |x_2|\,\bigr| \le |x_1 - x_2|, \]

which is valid for all $x_1, x_2 \in \mathbb{R}$.

(c) The square root function $f(x) = \sqrt{x}$ is uniformly continuous on the interval $[0, 1]$, but not Lipschitz continuous. This follows from the inequality

\[ \bigl|\sqrt{x_1} - \sqrt{x_2}\bigr| \le \sqrt{|x_1 - x_2|}, \]

which is proved immediately by squaring. Thus $\omega(\varepsilon) = \varepsilon^2$ is a modulus of continuity of the square root function on the interval $[0, 1]$. The square root function is not Lipschitz continuous on $[0, 1]$, since otherwise the choice $x_2 = 0$ would imply the relations

\[ \sqrt{x_1} \le L\,|x_1|, \qquad \frac{1}{\sqrt{x_1}} \le L, \]

which cannot hold for fixed $L > 0$ and all $x_1 \in (0, 1]$.

(d) The function $f(x) = \frac{1}{x}$ is continuous on the interval $(0, 1)$, but not uniformly continuous. Assume that we could find a modulus of continuity $\varepsilon \mapsto \omega(\varepsilon)$ on $(0, 1)$. Then for $x_1 = 2\varepsilon\omega(\varepsilon)$, $x_2 = \varepsilon\omega(\varepsilon)$ and $\varepsilon < 1$ we would get $|x_1 - x_2| < \omega(\varepsilon)$, but

\[ \left| \frac{1}{x_1} - \frac{1}{x_2} \right| = \frac{|x_2 - x_1|}{x_1 x_2} = \frac{\varepsilon\omega(\varepsilon)}{2\varepsilon^2\omega(\varepsilon)^2} = \frac{1}{2\varepsilon\omega(\varepsilon)}, \]

which becomes arbitrarily large as $\varepsilon \to 0$. In particular, it cannot be bounded from above by $\varepsilon$.
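
The difference between parts (a) and (c) can be made visible with difference quotients. The Python sketch below uses a hypothetical helper `max_quotient` (a name chosen here) that returns the largest quotient $|f(x_1) - f(x_2)| / |x_1 - x_2|$ over a grid. This is only a lower estimate of the smallest Lipschitz constant, not a proof: for $x^2$ on $[-1, 2]$ it stays below $2M = 4$, whereas for $\sqrt{x}$ on $[0, 1]$ it keeps growing as the grid is refined.

```python
import math

def max_quotient(f, a, b, samples=400):
    """Largest |f(x1) - f(x2)| / |x1 - x2| over all pairs of grid points in [a, b]."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    best = 0.0
    for i, x1 in enumerate(xs):
        for x2 in xs[i + 1:]:
            best = max(best, abs(f(x1) - f(x2)) / abs(x1 - x2))
    return best

# Quadratic function on [-1, 2]: M = max(|a|, |b|) = 2, so 2M = 4 is a Lipschitz constant.
print(max_quotient(lambda x: x * x, -1.0, 2.0))

# Square root on [0, 1]: the quotients grow without bound under grid refinement.
for samples in (100, 400, 1600):
    print(samples, max_quotient(math.sqrt, 0.0, 1.0, samples))
```

The double loop visits every pair of grid points, so the running time grows quadratically with `samples`; this is acceptable for the small grids used here.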

From the mean value theorem (Proposition 8.4) it follows that differentiable functions with bounded derivative are Lipschitz continuous. Further it can be shown that every function which is continuous on a closed, bounded interval $[a, b]$ is uniformly continuous there. The proof requires further tools from analysis for which we refer to [4, Theorem 3.13].

Apart from the intermediate value theorem, the fixed point theorem is an important tool for proving the existence of solutions of equations. Moreover one obtains an iterative algorithm for approximating the fixed point.

Definition C.17 A Lipschitz continuous mapping $f$ of an interval $I$ to $\mathbb{R}$ is called a contraction, if $f(I) \subset I$ and $f$ has a Lipschitz constant $L < 1$. A point $x^* \in I$ with $x^* = f(x^*)$ is called a fixed point of the function $f$.

Proposition C.18 (Fixed point theorem) A contraction $f$ on a closed interval $[a, b]$ has a unique fixed point. The sequence, recursively defined by the iteration

\[ x_{n+1} = f(x_n), \]

converges to the fixed point $x^*$ for arbitrary initial values $x_1 \in [a, b]$.

Proof Since $f([a, b]) \subset [a, b]$ we must have

\[ a \le f(a) \quad \text{and} \quad f(b) \le b. \]

If $a = f(a)$ or $b = f(b)$, we are done. Otherwise the intermediate value theorem applied to the function $g(x) = x - f(x)$ yields the existence of a point $x^* \in (a, b)$ with $g(x^*) = 0$. This $x^*$ is a fixed point of $f$. Due to the contraction property the existence of a further fixed point $y^*$ would result in

\[ |x^* - y^*| = |f(x^*) - f(y^*)| \le L\,|x^* - y^*| < |x^* - y^*|, \]

which is impossible for $x^* \ne y^*$. Thus the fixed point is unique.

The convergence of the iteration follows from the inequalities

\[ |x^* - x_{n+1}| = |f(x^*) - f(x_n)| \le L\,|x^* - x_n| \le \ldots \le L^n\,|x^* - x_1|, \]

since $|x^* - x_1| \le b - a$ and $\lim_{n \to \infty} L^n = 0$. □
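
The iteration of Proposition C.18 translates directly into a few lines of code. The following Python sketch is an illustration under stated assumptions, not one of the book's programs: the function name `fixed_point_iteration` is chosen here, and the example $f = \cos$ on $[0, 1]$ is an assumed test case. The cosine maps $[0, 1]$ into itself, and since $|f'(x)| = |\sin x| \le \sin 1 < 1$ there, the mean value theorem gives the Lipschitz constant $L = \sin 1$.

```python
import math

def fixed_point_iteration(f, x1, steps):
    """Return the iterates x_1, x_2, ..., x_{steps+1} of x_{n+1} = f(x_n)."""
    xs = [x1]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

a, b = 0.0, 1.0
L = math.sin(1.0)                    # assumed Lipschitz constant of cos on [0, 1]
xs = fixed_point_iteration(math.cos, 1.0, 100)
x_star = xs[-1]                      # numerical stand-in for the fixed point x*

# Compare the observed error |x* - x_{n+1}| with the bound L**n * (b - a).
for n in (0, 5, 10, 20):
    print(n, abs(x_star - xs[n]), L ** n * (b - a))
```

Here `xs[n]` holds $x_{n+1}$, so the printed pairs correspond to the inequality $|x^* - x_{n+1}| \le L^n (b - a)$ from the proof. The last iterate is used in place of the exact fixed point, which is adequate because the error after 100 steps is far below double precision.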


D Description of the Supplementary Software


In  our  view  using  and  writing  software  forms  an  essential  component  of  an  analysis 
course  for  computer  scientists.  The  software  that  has  been  developed  for  this  book 
is  available  on  the  website 

https://www.springer.com/book/9783319911540

This  site  contains  the  Java  applets  referred  to  in  the  text  as  well  as  some  source  files 
in  maple,  Python  and  MATLAB. 

For  the  execution  of  the  maple  and  MATLAB  programs  additional  licences  are 
needed. 

Java applets. The available applets are listed in Table D.1. The applets are executable and only require a current version of Java installed.

Source codes in MATLAB and maple. In addition to the Java applets, you can find maple and MATLAB programs on this website. These programs are numbered according to the individual chapters and are mainly used in experiments and exercises. To run the programs the corresponding software licence is required.

Source codes in Python. For each MATLAB program, an equivalent Python program is provided. To run these programs, a current version of Python has to be installed. We do not specifically refer to these programs in the text; the numbering is the same as for the M-files.

Table D.1 List of available Java applets
Sequences 

2D-visualisation  of  complex  functions 
3D-visualisation  of  complex  functions 
Bisection  method 

Animation  of  the  intermediate  value  theorem 
Newton’s  method 
Riemann  sums 
Integration 

Parametric  curves  in  the  plane 
Parametric  curves  in  space 
Surfaces  in  space 
Dynamical  systems  in  the  plane 
Dynamical  systems  in  space 
Linear regression

ℂ, 39
ℕ, 1
ℕ₀, 1
ℚ, 1
ℝ, 4
ℤ, 1
e, 21, 61, 85, 168, 323
i, 39
π, 3, 30
∇, 222
∞, 7

A 

Absolute  value,  7,  40 
function,  19 
Acceleration,  90 
vector,  192,  202 
Addition  theorems,  32,  42,  85 
Affine  function 
derivative,  84 
Analysis  of  variance,  261 
Angle,  between  vectors,  335 
ANOVA,  261 
Antiderivative,  140 
Approximation 

linear,  88,  89,  219 
quadratic,  224 
Arccosine,  33 
derivative,  95 
graph,  34 

Archimedean  spiral,  200 
Archimedes,  200 


Arc  length,  30,  196 
graph,  159 
parametrisation,  196 
Arcosh,  23 
derivative,  95 
Arcsine,  33 
derivative,  95 
graph,  33 
Arctangent,  34 
derivative,  95 
graph,  34 
Area 

sector,  74 

surface  of  sphere,  160 
triangle,  29 
under  a  graph,  150 
Area  element,  247 
Area  functions,  23 
Area  hyperbolic 
cosine,  23 
sine,  23 
tangent,  23 
Argument,  42 

Arithmetic  of  real  numbers,  56 
Arsinh,  23 

derivative,  95 
Artanh,  23 
derivative,  95 
Ascent,  steepest,  222 
Axial  moment,  253 

B 

Basis,  standard,  332 
Beam,  119 

Bijective, see function
Binomial formula, 359
Binormal vector, 202
Bisection method, 77, 110, 114
Bolzano, B., 64, 75
Bolzano-Weierstrass
theorem of, 64
Box-dimension,  126 

C 

Cantor,  G.,  2 
set,  128 
Cardioid,  201 

parametric  representation,  201 
Cauchy,  A.L.,  357 
product,  358 
Cavalieri,  B.,  244 
Cavalieri’s principle, 244
Centre  of  gravity,  248 
geometric,  249 
Chain  rule,  91,  219 
Characteristic  equation,  348 
Circle 

of  latitude,  238 
osculating,  198 

parametric  representation,  186 
unit,  30 
Circular  arc 
length,  195 
Clothoid,  198 

parametric  representation,  198 
Coastline,  126,  264 
Coefficient  of  determination,  263 
multiple,  267 
partial,  268 
Column  vector,  343 
Completeness,  2,  55 
Complex  conjugate,  40 
Complex  exponential  function,  42 
Complex  logarithm,  44,  45 
principal  branch,  45 
Complex  number,  39 
absolute  value,  40 
argument,  42 
conjugate,  40 
imaginary  part,  40 
modulus,  40 
polar  representation,  41 
real  part,  40 
Complex  plane,  41 
Complex  quadratic  function,  44 
Complex  root,  principal  value,  45 


Concavity,  108,  109 
Cone,  volume,  159 
Consumer  price  index,  117 
Continuity,  70,  212 
componentwise,  232 
Lipschitz,  194,  361 
uniform,  361 
Contraction,  363 
Convergence 
linear,  111 

Newton’s  method,  112 
order,  111 
quadratic,  111 
sequence,  53 
Convexity,  108,  109 
Coordinate  curve,  210,  236 
Coordinates 
of  a  point,  331 
polar,  35,  42 
Coordinate  system 
Cartesian,  331 
positively  oriented,  331 
right-handed,  331 
Coordinate  vector,  333 
Cosecant  function,  37 
Cosine,  28 

antiderivative,  142 
derivative,  85 
graph,  32 
hyperbolic,  22 
Cotangent,  28 
graph,  32 
Countability,  2 
Cuboid, 241
Curvature 
curve,  196 
graph,  198 
Curve,  185 

algebraic,  188 
arc  length,  196 
ballistic,  186 
change of parameter, 187
curvature,  196 
differentiable,  189 
figure  eight,  201 
in  the  plane,  185,  187 
length,  193,  194 
normal  vector,  191 
parameter,  185 
polar  coordinates,  200 
rectifiable, 193
reparametrisation, 187
Curve  in  space,  202 
arc  length,  202 
binormal  vector,  202 
differentiable,  202 
length,  202 
normal  plane,  202 
normal  vector,  202 
rectifiable,  202 
Curve  sketching,  105,  109 
Cusp,  188 
Cycloid,  190 

parametric  representation,  190 
Cyclometric  functions,  33 
derivative,  95 

D 

Damping,  viscous,  290 
Density,  247 
Derivative,  83,  217 
affine  function,  84 
arccosine,  95 
arcosh,  95 
arcsine,  95 
arctangent,  95 
arsinh,  95 
artanh,  95 
complex,  133 
cosine,  84 

cyclometric  functions,  95 

directional,  221 

elementary  functions,  96 

exponential  function,  85,  94 

Fréchet, 217, 233

geometric  interpretation,  213 

higher,  87 

higher  partial,  215 

hyperbolic  functions,  95 

inverse  function,  93 

inverse  hyperbolic  functions,  95 

linearity,  90 

logarithm,  93 

numerical,  96 

of  a  real  function,  83 

partial,  212 

power  function,  94 

quadratic  function,  84 

root  function,  84 

second,  87 

sine,  85 

tangent,  91 


Determinant,  346 
Diagonal  matrix,  345 
Diffeomorphism,  249 
Difference  quotient,  82,  83 
accuracy,  171 
one-sided,  97,  98 
symmetric,  98,  99 
Differentiability 

componentwise,  232 
Differentiable,  83 
continuously,  215 
Fréchet, 217
nowhere,  86 
partially,  212 
Differential  equation 
autonomous,  288,  299 
blow  up,  284 

characteristic  equation,  290 
conserved  quantity,  309 
dependent  variable,  276 
direction field, 277
equilibrium,  289 
existence  of  solution,  284 
first  integral,  309 
first-order,  275 
homogeneous,  278,  292 
independent  variable,  276 
inhomogeneous,  278,  293 
initial  condition,  277 
initial  value  problem,  301 
invariant,  309 
linear,  278,  290 
linear  system,  298 
Lotka-Volterra,  298 
nonlinear  system,  298 
particular  solution,  282 
power  series,  286,  315 
qualitative  theory,  288 
saddle  point,  303 
second-order,  290 
separation  of  variables,  276 
solution,  275 
solution  curve,  301 
stationary  solution,  281,  289 
stiff,  325 
trajectory,  301 
uniqueness  of  solution,  285 
Differentiation,  83 
Differentiation  rules,  90 
chain  rule,  91 
inverse function rule, 93
product rule, 90
quotient  rule,  91 
Dimension 
box,  126 

experimentally,  126 
fractal,  126 

Directional  derivative,  221 
Direction  field,  277 
Dirichlet,  P.G.L.,  152 
function,  152 
Discretisation  error,  97 
Distribution 
Gumbel,  118 
lognormal,  118 
Domain,  14 
Double  integral,  243 

transformation formula, 251

E 

Eigenvalue,  348 
Eigenvector,  348 
Ellipse,  189 

parametric representation, 189
Ellipsoid,  228 
Energy 

conservation  of,  313 
kinetic,  313 
potential,  313 
total,  313 
Epicycloid,  201 
eps,  10 

Equilibrium,  289,  301 

asymptotically  stable,  289,  302 
stable,  302 
unstable,  302 
Equilibrium  point,  301 
Error  sum  of  squares,  262 
Euler,  L.,  21 
Euler  method 

explicit,  322,  327 
implicit,  325 
modified,  328 
stability,  325 
Euler’s  formulas,  43 
Euler’s  number,  21,  61,  85,  168,  323 
Exponential  function,  20,  57 
antiderivative,  142 
derivative,  85,  94 
series  representation,  360 
Taylor  polynomial,  168 
Exponential  integral,  143 


Extremum,  106,  109,  225 
local,  108,  109 
necessary  condition,  106 
Extremum  test,  170 

F 

Failure  wedge,  119 
Field,  40 
First  integral,  309 
Fixed  point,  120,  363 
Floor  function,  25 
Fractal,  124 
Fraction,  1 
Fréchet, M., 216
Free fall, 81
Fresnel,  A.J.,  143 
integral,  143,  199 
Fubini,  G.,  244 
Fubini’s  theorem,  244 
Function,  14 
affine,  218 
antiderivative,  140 
bijective,  2,  15 
complex,  44 

complex  exponential,  42 
complex  quadratic,  44 
composition,  91 
compound,  91 
concave,  108 
continuous,  70,  212 
convex,  108 
cyclometric,  33 
derivative,  83 
differentiable,  83 
elementary,  143 
exponential,  57 
floor,  25 
graph,  14,  209 
higher  transcendental,  143 
hyperbolic,  22 
image,  14 
injective,  14 
inverse,  16 

inverse  hyperbolic,  23 
linear,  17 

linear  approximation,  89 
monotonically  decreasing,  107 
monotonically  increasing,  107 
noisy,  99 

nowhere  differentiable,  86 
piecewise continuous, 153
quadratic, 14, 18, 218
range, 14
real-valued, 14
slope,  107 

strictly  monotonically  increasing,  107 

surjective,  15 

trigonometric,  27,  44 

vector valued, 231

zero,  75 

Fundamental  theorem 
of  algebra,  41 
of  calculus,  156 


G 

Galilei, Galileo, 81
Galton, F., 257
Gauss, C.F., 115, 257
Gaussian  error  function,  143 
Gaussian  filter,  101 
Gradient,  221,  232 

geometric  interpretation,  222 
Graph,  14,  209 

tangent  plane,  220 
Grid 

mesh  size,  242 
rectangular,  241 
Grid  points,  175 


H 

Half  life,  280 
Half  ray,  189 
Heat  equation,  228 
Helix,  203 

parametric  form,  203 
Hesse,  L.O.,  224 
Hessian  matrix,  224 
Hyperbola,  190 

parametric  representation,  190 
Hyperbolic 
cosine,  22 
function,  22 
sine,  22 
spiral,  200 
tangent,  22 

Hyperbolic  functions,  22 
derivative,  95 
Hyperboloid,  228 


I 

Image,  14 

Imaginary  part,  40 

Indicator  function,  20,  245 

Inequality,  7 

INF,  9 

Infimum, 52

Infinity,  7 

Inflection  point,  109 
Initial  value  problem,  277,  301 
Injective,  see  function 
Integrable,  Riemann,  151,  243 
Integral 

definite,  149,  151 
double,  241,  243 
elementary  function,  142 
indefinite,  140 
iterated,  243 
properties,  154 
Riemann,  149 
Integration 
by  parts,  144 
numerical,  175 
rules  of,  143 
substitution,  144 
symbolic,  143 
Taylor  series,  172 
Integration  variable,  154 
Intermediate  value  theorem,  75 
Interval,  6 
closed,  6 
half-open,  6 
improper,  7 
open,  6 

Interval  bisection,  75 
Inverse  function  rule,  93 
Inverse  hyperbolic 
cosine,  23 
sine,  23 
tangent,  23 

Inverse  hyperbolic  functions,  23 
derivative,  95 
Inverse,  of  a  matrix,  346 
Iterated  integral,  243 
Iteration  method,  363 


J 

Jacobian,  217,  232 
Jacobi,  C.G.J.,  217 
Jordan,  C.,  347 
Julia, G., 131
set, 131

Jump  discontinuity,  71,  72 

K 

Koch,  H.  von,  129 

Koch’s  snowflake,  129,  136,  193 

L 

Lagrange,  J.L.,  166 
Lateral  surface  area 

solid  of  revolution,  160 
Law  of  cosines,  36 
Law  of  sines,  36 
Least  squares  method,  256 
Leibniz,  G.,  153 
Lemniscate,  201 

parametric  representation,  201 
Length 

circular  arc,  195 
differentiable  curve,  194 
differentiable  curve  in  space,  202 
Leontief,  W.,  318 
Level  curve,  210 
Limit 

computation  with  Taylor  series,  171 
improper,  54 
inferior,  64 
left-hand,  70 
of  a  function,  70 
of  a  sequence,  53 
of  a  sequence  of  functions,  354 
right-hand,  70 
superior,  64 
trigonometric,  74 
Limit  function,  354 
Lindenmayer, A., 134
Linear  approximation,  88,  89,  165,  219 
Line  of  best  fit,  115,  256 
through  origin,  115,  116 
Line,  parametric  representation,  189 
Liouville,  J.,  143 
Lipschitz,  R.,  284 
condition,  285 
constant,  285,  361 
continuous,  361 
Lissajous,  J.A.,  204 
figure,  204 
Little  apple  man,  131 
Logarithm,  21 
derivative,  93 


natural,  21 
Logarithmic 
integral,  143 
spiral,  200 
Loop,  200 

parametric  representation,  200 
Lotka,  A.J.,  298 
Lotka-Volterra  model,  308,  327 
L-system,  135 

M 

Machine  accuracy 
relative,  10,  12 
Malthus,  T.R.,  281 
Mandelbrot,  B.,  130 
set,  130 
Mantissa,  8 
Mapping,  2,  14 
linear,  344 
Mass,  247 

Mass-spring-damper  system,  290 
Mass-spring  system,  nonlinear,  319 
Matrix,  343 

coefficient,  343 

determinant,  346 

diagonal  element,  345 

element,  343 

entry,  343 

inverse,  346 

invertible,  346 

Jordan  canonical  form,  347 

nilpotent,  350 

product,  345 

product  with  vector,  344 

regular,  346 

similar,  347 

square,  343 

transposed,  344 

unit,  345 

zero,  345 

Matrix  algebra,  343 
Maximum,  52 
global,  105 
isolated  local,  227 
local,  106,  108,  170,  224 
strict,  106 

Mean  value  theorem,  107 
Measurable,  245 
Meridian,  238 
Minimum,  52 
global, 106
isolated local, 227
local,  106,  170,  224 
Mobilised  cohesion,  119 
Model,  linear,  256,  265 
Modulus,  40 

Modulus  of  continuity,  361 
Moment 

of  inertia,  119 
statical,  248 

Monotonically  decreasing,  107 
Monotonically  increasing,  107 
Moving  frame,  192,  202 
Multi-step  method,  323 

N 

NaN,  9 

Neighbourhood,  53,  124 
Neil,  W.,  188 
Newton,  I.,  110,  294 
Newton’s  method,  111,  114,  119 

in  C,  133 

local  quadratic  convergence,  235 
two  variables,  233 
Nonstandard  analysis,  154 
Normal  domain,  246 
of  type  I,  246 
of  type  II,  246 
Normal  equations,  258 
Numbers,  1 
complex,  39 
decimal,  3 
floating  point,  8 
largest,  9 
normalised,  9 
smallest,  9 
integer,  1 
irrational,  4 
natural,  1 
random,  100 
rational,  1 
real,  4 

transcendental,  3 
Numerical  differentiation,  96 

O 

Optimisation  problem,  109 
Orbit,  periodic,  310 
Order  relation,  5 
properties,  6 
rules  of  computation,  6 


Osculating  circle,  198 

P 

Parabola 
Neil’s,  188 
quadratic,  18 
Paraboloid 
elliptic,  211 
hyperbolic,  210 
Partial  mapping,  210 
Partial  sum,  58 
Partition,  151 
equidistant,  153 
Peano,  G.,  284 

Pendulum,  mathematical,  312,  314 
Plane 

in  space,  339 
intercept,  339 
normal  vector,  340 
parametric  representation,  339 
slope,  339 
Plant 

growth,  136 
random,  138 
Point  of  expansion,  167 
Point  space,  334 
Polar  coordinates,  233 
Population  model,  281 
discrete, 51
Malthusian,  281 
Verhulst,  51,  66,  281 
Position  vector,  333 
Power  function,  18 
antiderivative,  142 
derivative,  94 

Power  series,  equating  coefficients,  287 
Precision 
double,  8 
single,  8 

Predator-prey  model,  298 
Principal  value 
argument,  42 
Product  rule,  90 
Proper  range,  14 
Pythagoras,  27 
theorem,  27 

Q 

Quadratic  function 
derivative, 84
graph, 18

Quadrature  formula,  111 
efficiency,  180 
error,  181 
Gaussian,  180 
nodes,  111 
order,  178 

order  conditions,  179 
order reduction, 182
Simpson  rule,  111 
stages,  111 
trapezoidal  rule,  176 
weights,  111 
Quotient  rule,  91 

R 

Radian,  30 

Radioactive  decay,  24,  280 
Rate  of  change,  89,  280 
Ratio  test,  67 
Real  part,  40 
Rectifiable,  193 
Regression 

change  point,  273 
exponential  model,  257,  273 
linear,  255 
loglinear,  257 
multiple  linear,  265 
multivariate  linear,  265 
simple  linear,  256 
univariate  linear,  256 
Regression  line,  256 
predicted,  259 
through  origin,  115 
Regression  parabola,  120 
Regression  sum  of  squares,  261 
Remainder  term,  166 
Residual,  259 
Riccati,  J.F.,  287 
equation,  287,  328 
Riemann,  B.,  149 
integrable,  151,  243 
integral,  151 
sum,  151,  242 
Right-hand  rule,  331 
Root,  complex,  41,  43 
Root  function,  19 
derivative,  84 
Rounding,  10 
Rounding  error,  97 
Row  vector,  344 


Rules  of  calculation 
for  limits,  53 

Runge-Kutta  method,  323 

S 

Saddle  point,  225,  227 
Saddle  surface,  210 
Scalar  multiplication,  332 
Scatter  plot,  115,  255 
Schwarz,  H.A.,  216 
theorem,  216 
Secant,  82 
slope,  83 

Secant  function,  37 
Secant  method,  115 
Self-similarity, 123
Semi-logarithmic,  111 
Sequence,  49 

accumulation  point,  62 
bounded from above, 51
bounded from below, 52
complex-valued, 50
convergent,  53 
geometric,  55 
graph,  50 
infinite,  49 
limit,  53 

monotonically decreasing, 51
monotonically increasing, 51
real-valued,  50 
recursively  defined,  50 
settling,  53 

vector-valued, 50, 211
convergence,  211 
Sequence  of  functions 

pointwise  convergent,  354 
uniformly  convergent,  355 
Series,  58 

absolutely  convergent,  357 
Cauchy  product,  358 
comparison  criteria,  60 
convergent,  58 
divergent,  58 
geometric,  59 
harmonic,  60 
infinite,  58 
partial  sum,  58 
ratio  test,  67 
Set 

boundary,  124 
boundary  point,  124 
bounded,  124 
Cantor,  128 
cardinality,  2 
closed,  124 
covering,  125 
interior  point,  124 
Julia,  131 
Mandelbrot,  130 
of  measure  zero,  245 
open,  124 
Sexagesimal,  5 
Shape  function,  256 
Sign  function,  19 
Simpson,  T.,  177 
rule,  177 
Sine,  28 

antiderivative,  142 
derivative,  84 
graph,  32 
hyperbolic,  22 
Taylor  polynomial,  168 
Taylor  series,  170 
Sine  integral,  143 
Snowflake,  129 
Solid  of  revolution 

lateral  surface  area,  160 
volume,  158 

Space-time  diagram,  300 
Sphere,  237 

surface  area,  160 
Spiral,  200 

Archimedean,  200 
hyperbolic,  200 
logarithmic,  200 
parametric  representation,  200 
Spline,  101 
Spring,  stiffness,  290 
Square  of  the  error,  116 
Standard  basis,  332 
Stationary  point,  106,  225 
Step  size,  322 
Straight  line 
equation,  337 
in  space,  340 
intercept,  17,  338 
normal  vector,  339 
parametric  representation,  338 
slope,  17,  29,  338 
Subsequence,  62 
Substitution,  144 
Superposition  principle,  278,  292 


Supremum,  51 
Surface 

in  space,  210 
normal  vector,  237 
of  rotation,  237 
parametric,  236 
regular  parametric,  236 
tangent  plane,  237 
tangent  vector,  213 
Surjective,  see  function 
Symmetry,  99 

T 

Tangent,  28 

graph,  32,  82,  87 
hyperbolic,  22 
plane,  220 
problem,  82 
slope,  87 
vector,  191,  202 
Taylor,  B.,  165 
expansion,  97 
formula,  165,  223 
polynomial,  167 
series,  169 
theorem,  170 
Telescopic  sum,  60 
Thales  of  Miletus,  28 
theorem,  28 
Total  variability,  261 
Transformation  formula,  251 
Transport  equation,  228 
Transpose 

of  a  matrix,  344 
Trapezoidal  rule,  176 
Triangle 
area,  29 
hypotenuse,  27 
inequality,  11 
leg,  27 

right-angled,  27 
Triangle  inequality,  195 
Trigonometric  functions,  27,  28 
addition  theorems,  32,  36 
inverse,  33 
Triple  product,  347 
Truncated  cone 
surface  area,  37 
surface  line,  37 
U 

Uniform 

continuity,  361 
convergence,  355 
Unit  circle,  30,  43 
Unit  matrix,  345 
Unit  vector,  332 

V 

Variability 

partitioning,  262 
sequential,  268 
total,  261 

Variation  of  constants,  281 
Vector,  332 

cross  product,  336 
dot  product,  335 
inner  product,  335 
magnitude,  332 
norm,  332 
orthogonal,  335 
perpendicular,  335 
unit,  332 
zero,  332 

Vector  algebra,  331 
Vector  field,  231 
Vector  space,  50,  334 
Velocity,  89 
average,  81 
instantaneous,  82,  89 
Velocity  vector,  191,  202 
Verhulst,  P.-F.,  51,  66,  281,  289 
Vertical  throw,  141 
Volterra,  V.,  298 
Volume 
cone,  159 

solid  of  revolution,  158 

W 

Weber-Fechner  law,  24 
Weierstrass,  K.,  64 

Z 

Zero  matrix,  345 
Zero  sequence,  69 
Zero  vector,  332 


